anonusr

Members
  • Content count

    7

About anonusr

  • Rank
    Virgin

Profile Information

  1. anonusr

    Server upgrade is complete!

    Aw... If you don't mind me asking, was my explanation of why the browsers were reacting differently at least correct (regardless of root cause)? I thought I had that pretty well figured out. Regardless, glad it got fixed!
  2. anonusr

    Server upgrade is complete!

    So - I think I figured out what's causing this (because I was curious, and I decided to call up developer mode on a few browsers). And, if I'm right (and someone can feel free to comment if I'm not), it's actually kind of interesting.

    When you access a story on the archive right now, the server spits back an HTTP status code of 302 - Found (Elsewhere). This is used by sites to tell a browser that the thing it wanted (in this case, the story) is located somewhere else on the server, but that place might change. For example, it's like saying "you looked on the top shelf for this thing. I moved where I keep this thing, and you can find it in the closet now, but I might move it around again later, so go back to the top shelf whenever you need it, and I'll tell you where the thing is at that time."

    When the server issues the 302, it's supposed to give you the location of the thing you're looking for. Here's the weird thing - the HTTP standard marks sending that location as a 'SHOULD': the server should give it to you, but it doesn't actually have to (if it had to, it would be 'MUST' or 'SHALL' in the standard). Right now, the 302 being generated by AFF doesn't have a location.

    So - what happens if you don't get a location with your 302? It depends on the browser. Chrome will ignore the redirect and simply load the page from the payload of the redirect (redirects can carry a page with them so that, if the redirect fails, you get a webpage that says "hey - go here"). In the case of AFF, the payload of the redirect is actually the story, so Chrome works fine. Firefox will try to redirect the page to itself. So, to go back to the shelf example, it looks at the top shelf, sees a redirect with nowhere to go, and then looks at the top shelf again. And again. And again. Eventually, Firefox figures out something is wrong, stops looking for the content, and shows you an error. IE will interpret it as a 'return to root folder' - so if you requested http://games.adult-fanfiction.org/story.php?no=[numberhere], it will redirect to http://games.adult-fanfiction.org.

    So yeah - Chrome will work for now. Also, if you have a way to suppress redirects from AFF, that might work too, but using Chrome is probably easier for the moment.
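The check described in this post (is the response a redirect, and does it carry a Location header?) can be sketched in Python 3 roughly as follows. This is only an illustration, not what any browser actually runs: the function names `classify_response` and `fetch_without_following` are made up for this example, and the sketch relies on the standard library's `http.client`, which returns a redirect response as-is rather than following it.

```python
import http.client
from urllib.parse import urlsplit

# Status codes that normally tell the client to look elsewhere.
REDIRECT_CODES = {301, 302, 303, 307, 308}


def classify_response(status, headers):
    """Describe what a client can do with a response (hypothetical helper)."""
    # Header names are case-insensitive, so normalise them before the lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    location = lowered.get("location")
    if status not in REDIRECT_CODES:
        return "not a redirect"
    if location:
        return "redirect to " + location
    # This is the situation from the post: a 302 with nowhere to go.
    return "redirect without Location"


def fetch_without_following(url):
    """Fetch url once and return (status, headers, body); redirects are not followed."""
    parts = urlsplit(url)
    if parts.scheme == "https":
        conn = http.client.HTTPSConnection(parts.netloc, timeout=10)
    else:
        conn = http.client.HTTPConnection(parts.netloc, timeout=10)
    try:
        path = parts.path or "/"
        if parts.query:
            path += "?" + parts.query
        conn.request("GET", path)
        response = conn.getresponse()
        return response.status, dict(response.getheaders()), response.read()
    finally:
        conn.close()


if __name__ == "__main__":
    # No network needed for these: they only exercise the classification.
    print(classify_response(302, {"Location": "/story.php?no=1"}))
    print(classify_response(302, {"Content-Type": "text/html"}))
```

Passing the status and headers from `fetch_without_following` into `classify_response` would show which case a given page falls into. What to do in the "redirect without Location" case is left to the client, which is why the browsers behaved differently.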
  3. Well that would explain the deletion - thank you for clarifying that. Still - if anyone does stumble upon this topic and thinks I'm thinking of a different work (not My Master), please let me know (I feel like the work I was thinking of was updated within the past year, so it might not have been My Master).
  4. There was a fic (I believe "My Master", which was located here: http://ff.adult-fanfiction.org/story.php?no=35509) where Cloud has a few offhand meetings with Sephiroth as a Cadet, and he can hear voices inside his head calling him a puppet. The pairing was Seph/Cloud. Was this "My Master", and if so, does anyone know where to find it? If not, does this sound familiar to anyone? Thanks for any help with this. -anonusr
  5. anonusr

    Archive Format Problems and Broken HTML

    Sorry for the delayed reply - here's the code I used. Depending on whether fix_lines_global is set, the code will either just fix the spans, or will remove the <br> tags as well. If fix_lines_global is not set, this should not have a negative impact on any story (including one that is already fixed), since it should only clear the bad unicode and spans. Upon looking at it again, this code really isn't efficient (it does several passes of replace), but, as I mentioned, I suspect it might be better than going over stuff by hand. Regardless, the code is free for you to use if you want it.

    Regards, anonusr

    PS: Python is a whitespace sensitive language, and the forum text editor seems to be removing all the indentation when I post. If you need a version with correct indentation, I can email you a copy, or I can give you a link to the code stored elsewhere. I've tried to mark in this post where if statements begin and end.

    import re

    #Set to True to also remove <br> tags (see below).
    fix_lines_global = False

    def clean_story(html_page):
        #Clean the story
        #This page is unicode - use unicode strings
        #Also, ignore any unicode bytes that are corrupted.
        clean_page = unicode(html_page,'UTF-8',errors='ignore')
        #First, replace all &gt; with > (so later substitutions
        #work as expected). The entity names were lost when this post
        #was rendered; '&gt;' follows from this comment, and '&nbsp;'
        #is assumed for the second replacement.
        clean_page = clean_page.replace('&gt;','>').replace('&nbsp;','')
        #remove \r (so we only have standard \n linebreaks)
        clean_page = clean_page.replace('\r','')
        #Though not technically correct, replace all <br /> with <br>
        #so the page is consistent.
        clean_page = re.sub(r'<br />',r'<br>',clean_page)
        #next, fix all problems with broken spans
        clean_page = re.sub(r'<span<br>','<span',clean_page)
        #Sometimes stories are broken into proper <p> tags,
        #whereas other times they are broken by <br>. It's
        #impossible to know when the <br> help, and when they
        #hurt. Therefore, this can be turned on with fix_lines_global
        if fix_lines_global:
            #Less aggressive version only removes line breaks if they are stand alone
            #clean_page = re.sub(r'(?<!<br>\n)<br>(?!\n<br>)','',clean_page)
            #more aggressive version will still remove single line breaks, but,
            #if there are multiple line breaks, this will take off one.
            #we are using this one because stories that have to be reflowed
            #also seem to have extra space.
            clean_page = re.sub(r'<br>(?!\n<br>)','',clean_page)
        #if fix_lines_global ends here
        #Remove extra breaks from a page (regardless if fix_lines_global is set)
        clean_page = re.sub(r'(<br>\n){3,}',r'<br>',clean_page)
        return clean_page

    Edit: Added comments to denote where blocks end, and added import statement
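To make the difference between the two <br> patterns in the code above concrete, here is a small sketch that runs both on a made-up sample string. It is written for Python 3 (the posted code is Python 2); the names `LESS_AGGRESSIVE`, `MORE_AGGRESSIVE`, and `sample` are invented for this illustration, and the sample text is not taken from any real story.

```python
import re

# The two patterns from the post, compiled for reuse.
LESS_AGGRESSIVE = re.compile(r'(?<!<br>\n)<br>(?!\n<br>)')
MORE_AGGRESSIVE = re.compile(r'<br>(?!\n<br>)')

# Made-up sample: one stand-alone break, then a run of two breaks.
sample = "line one<br>\nline two<br>\n<br>\nline three"

# Remove only breaks that are not part of a run of breaks.
less = LESS_AGGRESSIVE.sub('', sample)

# Remove stand-alone breaks and also the last break of each run.
more = MORE_AGGRESSIVE.sub('', sample)

print(repr(less))
print(repr(more))
```

On this sample, the less aggressive pattern drops the stand-alone break after "line one" and leaves the run of two breaks alone, while the more aggressive pattern also drops the last break of the run, so one break of the pair survives.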
  6. anonusr

    Archive Format Problems and Broken HTML

    Thank you for the links, but I'm somewhat confused... I'm aware an issue exists, and I know it was brought on during the deployment of the RTE editor. However, it looks like you're referring to a different bug than I am (I'd assume wall of text is a lack of breaks rather than breaks appearing in the wrong place). Regardless, as DemonGoddess061 mentioned, the fixes have been applied manually. The primary point of my post was to ask if you wanted my code to clean up the stories that were having problems with the broken span bug in a more automated fashion (assuming you can directly modify html text for stories in your database and can run a script against that html). That's all.
  7. Hello, Since the new version of the archive was deployed, I've noticed that many older stories have broken html tags that get mixed into the story text. The usual culprit is a broken span tag that looks like <span<br>\r\n, which causes the style attribute (often mso-spacerun:yes) to show up in the story text. Here's an example of a fic that has this problem (it's not the only one, just one I stumbled upon): http://anime.adultfa...hp?no=600042685 Additionally, these stories often have line break problems: because the text is divided into paragraph tags <p> and also has additional, unusual line break tags <br>, it appears to have strange breaks mid-paragraph and large amounts of white space between paragraphs (from the combination of <br> and the end of the paragraph). Finally, there are broken unicode whitespace characters in the text (usually as part of an mso-spacerun span that otherwise appears empty); they do not show up in a browser, but they do show up in the html code. I've written a tool to download a story and clean up the bad HTML (I was having trouble reading some of the works on the site). The problem is, it's difficult for me to automatically figure out which stories have <br> problems and which ones don't: many stories have the entire story in a single span or paragraph and use <br> tags to break correctly, whereas others have the undesired behavior. Therefore the tool isn't perfect - though it will clean the dangling html in story text, you have to tell it to clean the <br> tags explicitly (I tried to make the tool function so that it can only help, and won't hurt an already well formatted story). I was wondering, would the archive like a copy of my code (it's written in python2)? I'm not sure how widespread the problem is, but I know I've seen it on quite a few stories, and I thought it might be helpful for someone. Regards, anonusr
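As a rough illustration of the broken span problem described in this post, the following Python 3 sketch checks a piece of story HTML for the `<span<br>` marker and repairs it. The helper names `has_broken_span` and `repair_broken_spans` and the sample markup are invented for this example; it also assumes the stray break may appear as either `<br>` or `<br />`, which the post does not state.

```python
import re

# A span tag whose opening was interrupted by a stray line break tag.
# Matching "<br />" as well as "<br>" is an assumption made for this sketch.
BROKEN_SPAN = re.compile(r"<span<br\s*/?>")


def has_broken_span(html):
    """Return True if the HTML contains the broken span marker."""
    return BROKEN_SPAN.search(html) is not None


def repair_broken_spans(html):
    """Drop carriage returns and turn '<span<br>' back into '<span'."""
    return BROKEN_SPAN.sub("<span", html.replace("\r", ""))


if __name__ == "__main__":
    # Made-up markup imitating the pattern from the post.
    sample = '<p>Hello<span<br>\r\n style="mso-spacerun:yes"> </span>world</p>'
    print(has_broken_span(sample))
    print(repr(repair_broken_spans(sample)))
```

A check like this only flags the dangling span markup. It does not help decide whether a story's <br> tags are helping or hurting, which is the part the post describes as hard to automate.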