
Cleanup update


DemonGoddess

I've finally finished the wallotext fix in the books subdomain.

Some interesting things I've found along the way...

  • There are NINE different types of text encoding in the data inserts. There should only be one.
  • Word and plain text interpret the PHP line break differently: Word treats it as a SPACE.

Anyway, I worked with the data as it was and didn't add anything to it, so in some of the stories you might see paragraphs with no blank line between them. If there was only 1 line break and not 2, that's what I got. The line breaks are there, though, so I can always go back later and add the extra one.
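For anyone curious, adding the missing blank line later could look roughly like the Python sketch below. It is only an illustration, not the tooling actually used on the archive, and it assumes that every literal newline in a record marks a paragraph end. The function name is made up for the example.

```python
import re


def ensure_blank_line_between_paragraphs(text: str) -> str:
    """Turn every run of line breaks into exactly one blank line.

    Assumption: each newline in the stored text is a paragraph break.
    """
    # Normalize Windows (\r\n) and old Mac (\r) line endings to \n first.
    text = text.replace("\r\n", "\n").replace("\r", "\n")
    # A newline, optionally followed by whitespace-only lines, becomes
    # two newlines (one blank line between paragraphs).
    return re.sub(r"\n(?:[ \t]*\n)*", "\n\n", text.strip("\n"))
```

Records that already have 2 line breaks come out unchanged, so it would be safe to run over everything more than once.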

In progress-

Wallotext fix - now in buffy

Buffy - chapter data repair, review board repair

  • chapter data repair
  • BtVS - General - complete
  • BtVS - Round Robins - complete
  • BtVS - FemmeSlash - in progress
    Finished- Cordelia/Faith, Dawn/Faith, Faith/Willow, Tara/Willow
    in progress - Buffy/Willow

cartoons - story movement from anime to cartoon, specifically Transformers fanfiction, date/time stamp repair, review board repair

  • Transformers: Animated - moved, fixed, sorted
  • Transformers - G1 - repair and move in progress

celeb - sort in progress


  • 2 weeks later...

Regarding chapter data repair:

I thought I'd give some examples. This is what happened with the addition of the rich text editor: it showed code (often), screwed up spacing (100% of the time), changed fonts throughout (often), and other such things.

This particular story that I just fixed is 29 chapters long. BEFORE stripping out all the garbage from the Word-exported HTML, the collective size of the thing was 8MB. Once I was done fixing it, it shrank to 1.5MB.

This is before I fixed things

beforerte.PNG

this is after

afterrte.PNG

another example of the rte stuff to fix-

rtebefore.PNG

rteafter.PNG

As you can see, there is quite a difference.

These are the steps I have to take to fix these particular stories:

  • Check the text encoding. If it's not universal, convert it.
  • REMOVE all the extra HTML that makes illegal function calls, is just sort of ...there..., and whatnot. (This is done line by line by line by line...)
  • Convert the PHP line break to a space, because for these records the inserted line break is actually a space, not a paragraph end or line break.
  • Double-check each record to make sure I missed nothing.
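As a rough illustration of the first and third steps above, here is a minimal Python sketch. It is not the actual process used here, which is done by hand. The list of candidate encodings is an assumption (the nine encodings found are not named), the target is assumed to be UTF-8, and the "PHP line break" is assumed to be a `<br />` tag like the one PHP's `nl2br()` inserts. If it is really a literal newline, the pattern would need to change.

```python
import re

# Encodings to try, in order. Assumed for illustration only.
CANDIDATE_ENCODINGS = ["utf-8", "cp1252", "latin-1"]

# Matches <br>, <br/>, and <br /> in any letter case.
BR_TAG = re.compile(r"<br\s*/?>", re.IGNORECASE)


def to_utf8_text(raw: bytes) -> str:
    """Decode raw bytes using the first candidate encoding that fits."""
    for encoding in CANDIDATE_ENCODINGS:
        try:
            return raw.decode(encoding)
        except UnicodeDecodeError:
            continue
    # latin-1 accepts every byte, so this is only a safety net.
    return raw.decode("utf-8", errors="replace")


def breaks_to_spaces(text: str) -> str:
    """Replace each line-break tag with a space, then collapse extra spaces."""
    text = BR_TAG.sub(" ", text)
    return re.sub(r" {2,}", " ", text)
```

Trying encodings in order is a guess, not real detection: bytes that happen to decode cleanly in the wrong encoding will still come out garbled, which is part of why each record gets double-checked.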

Mind you, it doesn't SOUND like a lot, but consider that a simple paragraph open container, which is NORMALLY 3 characters (<p>), can often be upwards of 50 characters. Not only that, there are EXTRA paragraphs added, which reference a nonexistent .css file for formatting. I run across this with each HTML element in the document. This is why this particular story shrank so drastically: the sheer file size came from the extra, useless HTML embedded within each record.
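To give an idea of the bloat, here is a small Python sketch that strips the attributes off opening tags and drops empty paragraphs. It is a blunt illustration under stated assumptions, not how the records are actually repaired: the sample markup and function names are made up, regex cleanup breaks on attribute values containing `>`, and stripping every attribute would also remove things worth keeping, such as link targets.

```python
import re

# An opening tag that carries attributes, e.g. <p class="MsoNormal" style="...">.
TAG_WITH_ATTRS = re.compile(r"<([a-zA-Z][a-zA-Z0-9]*)\s[^>]*?(/?)>")

# A paragraph holding nothing but whitespace or &nbsp; entities.
EMPTY_PARAGRAPH = re.compile(r"<p>(?:\s|&nbsp;)*</p>", re.IGNORECASE)


def strip_attributes(html: str) -> str:
    """Reduce every opening tag to its bare name, keeping a trailing slash."""
    return TAG_WITH_ATTRS.sub(r"<\1\2>", html)


def clean_word_html(html: str) -> str:
    """Strip attributes first, then remove the now-bare empty paragraphs."""
    return EMPTY_PARAGRAPH.sub("", strip_attributes(html))


# Hypothetical sample of Word-style markup.
sample = (
    '<p class="MsoNormal" style="margin-bottom:0in">Hello</p>'
    '<p class="MsoNormal">&nbsp;</p>'
)
cleaned = clean_word_html(sample)
```

Here `cleaned` ends up as `<p>Hello</p>`, a fraction of the length of `sample`. Multiply that across every element in every chapter and the drop in file size makes sense.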

What causes the normally "invisible" coding elements to appear, along with the oddball spacing and whatnot, is that the rich text editor attempts to strip all this garbage out. It can only do part of this until a user actually opens the chapter and clicks "edit chapter" to resave it. It then finishes stripping out most of it. The only thing I've seen it have difficulty removing is the extra paragraphs Word likes to insert in the converted document.

So, what would take a user maybe a minute or two a chapter (unless they want to make additional changes) by simply resaving it, takes ME roughly two to four hours per chapter, depending upon just how much garbage was inserted into the record. The more crap code, the longer it takes to fix it.

Which is why I am grateful to the users who've taken the time, not waited for me, and gone ahead and FIXED their stories where it was a Word file exported to HTML. Seriously. Those of you who've already done this have saved me untold hours of work.


  • 1 month later...

Finished Transformers story moves.

In anime - only anime Transformer stories to be added. Those titles are:

  • Transformers: Headmasters
  • Transformers: Robots in Disguise
  • Transformers: Unicron Trilogy (Armada, Energon, and Cybertron)

In comics - Transformers - IDW published

In Marvel 'verse - Transformers (The Transformers, Transformers Universe, and a few other titles which escape me at the moment)

In Movies - all Transformers live action movies

In cartoon -

  • G1
  • Beast Wars
  • Animated
  • Prime

  • 2 weeks later...

Something the users wouldn't see, except perhaps as a slowdown when accessing data: I cleaned up the leftovers from story deletions, whether those were done by mods in the past to delete underagers or by the users themselves.

The story deletion function would delete the chapters and the story, but leave the rates and reviews.

So, I cleaned up all those leftovers, which were attached to absolutely nothing (10+ years of it). From now on it's something I'd do as routine maintenance every week or two.
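For the technically inclined, this kind of orphan cleanup can be sketched in Python with SQLite as below. The table and column names (`stories`, `reviews`, `rates`, `story_id`) are hypothetical, and the archive's real database and schema are not described here, so treat it as an illustration of the idea only.

```python
import sqlite3


def delete_orphans(conn: sqlite3.Connection, child_table: str,
                   parent_table: str = "stories", fk: str = "story_id") -> int:
    """Delete child rows whose parent story no longer exists.

    Table and column names are hypothetical. They must come from trusted
    code, never from user input, since they are placed directly in the SQL.
    Returns the number of rows removed.
    """
    cur = conn.execute(
        f"DELETE FROM {child_table} "
        f"WHERE NOT EXISTS ("
        f"SELECT 1 FROM {parent_table} "
        f"WHERE {parent_table}.id = {child_table}.{fk})"
    )
    conn.commit()
    return cur.rowcount
```

Running `delete_orphans(conn, "reviews")` and then `delete_orphans(conn, "rates")` on a schedule would cover the leftovers the deletion function misses. Fixing the deletion function to remove them up front would be the longer-term answer.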

