How is web form content encoded in Safari's cache?
2014-07
A friend of mine just accidentally navigated away from a web page where she had entered a rather lengthy text into a web form. When she navigated back to the page, the text was lost.
(Spoiler: The page had an auto-save feature and we were actually able to recover most of the text. Still, I'm curious whether it is possible to recover the text from the cache.)
She was using Safari, and by running grep -r "[a rather unique word she remembered]" ~/Library/Caches/com.apple.Safari I was able to find the title of her text in com.apple.Safari/Cache.db. Now it seems as if the text is also there, but possibly it needs to be converted into a different format.
Does anybody know how I can find out how to convert it to a readable format? Or is what I think is the text something completely different?
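One way to look inside the file in a more structured way than grep, sketched here under an assumption: as far as I know, Safari's Cache.db is an SQLite database, and the snippet checks the file header before relying on that. The helper names (is_sqlite_file, search_sqlite) are made up for this example, and no particular table layout is assumed; the code simply walks every table it finds. It is safer to run it on a copy of Cache.db than on the live file.

```python
import sqlite3

# Every SQLite 3 database file starts with this 16-byte header.
SQLITE_MAGIC = b"SQLite format 3\x00"


def is_sqlite_file(path):
    """Return True if the file at `path` starts with the SQLite 3 header."""
    with open(path, "rb") as f:
        return f.read(16) == SQLITE_MAGIC


def search_sqlite(path, needle):
    """Return (table, column, rowid) for every text/blob cell containing `needle` (bytes)."""
    hits = []
    conn = sqlite3.connect(path)
    try:
        # Return TEXT values as raw bytes so text and BLOB cells are searched the same way.
        conn.text_factory = bytes
        tables = [
            row[0].decode()
            for row in conn.execute("SELECT name FROM sqlite_master WHERE type='table'")
        ]
        for table in tables:
            try:
                cursor = conn.execute(f'SELECT rowid, * FROM "{table}"')
            except sqlite3.OperationalError:
                continue  # e.g. a table without a rowid; skipped in this sketch
            columns = [d[0] for d in cursor.description]
            for row in cursor:
                for column, value in zip(columns[1:], row[1:]):
                    if isinstance(value, bytes) and needle in value:
                        hits.append((table, column, row[0]))
    finally:
        conn.close()
    return hits
```

Calling search_sqlite("Cache-copy.db", b"verstaubtes") (the file name is just a placeholder) would list the cells that contain the word, which narrows down where the page content is stored.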
Below is the relevant part of the cache: "bibliotheken-mehr-als-ein-verstaubtes-buechergestell" is the title, the German sentence after "wp_autosave" means that the text was autosaved at 10:27 (that is actually how we found out that it must still be in her account on the site), and I assume everything after the first "IHDRÛ" must be the text.
‎<span id="edit-slug-buttons"><a href="#post_name" class="edit-slug button button-small hide-if-no-js" onclick="editPermalink(1844); return false;">Bearbeiten</a></span>
<span id="editable-post-name-full">bibliotheken-mehr-als-ein-verstaubtes-buechergestell</span>
<span id='view-post-btn'><a href='http://www.medienkompetenz-im-digitalen-zeitalter.ch/?p=1844' class='button button-small'>Beitrag ansehen</a></span>
Å>àø Ç~{"wp-refresh-post-lock":{"new_lock":"1402741629:62"},"wp_autosave":{"success":true,"message":"Entwurf wurde um 10:27:09 Uhr gespeichert."},"wp-auth-check":true,"server_time":1402741629}hàøÅR{"wp-refresh-post-lock":{"new_lock":"1402741644:62"},"wp-auth-check":true,"server_time":1402741644}Ü+àøåXâPNG
IHDRÛ [...] sBIT [...] IDAT8 [...] IEND [...] âPNG
IHDRÛ [...] sBIT [...] IDAT8 [...] IEND [...] âPNG
IHDRÛ [...] sBIT [...] IDAT8 [...] IEND [...]
[Unprintable bytes between the markers omitted.]
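A note on the markers in the excerpt above: PNG, IHDR, IDAT and IEND are the standard signature and chunk names of a PNG image, so the bytes after each "IHDR" are most likely small images (icons, for instance) rather than encoded text. The "â" in front of "PNG" looks like the signature byte 0x89 as displayed by a Mac Roman terminal, but that part is a guess. The following is a rough sketch, not a full PNG parser, that locates such embedded images in a blob of raw bytes and removes them so that only the surrounding data remains; the function names are made up for this example.

```python
# The fixed 8-byte signature every PNG file starts with.
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"
# Type name of the final chunk; it is followed by a 4-byte CRC.
IEND = b"IEND"


def find_png_spans(data):
    """Return (start, end) byte offsets of every embedded PNG found in `data`."""
    spans = []
    pos = 0
    while True:
        start = data.find(PNG_SIGNATURE, pos)
        if start == -1:
            break
        iend = data.find(IEND, start)
        if iend == -1:
            break  # truncated image: no closing chunk found
        end = min(iend + len(IEND) + 4, len(data))  # include the CRC after "IEND"
        spans.append((start, end))
        pos = end
    return spans


def strip_pngs(data):
    """Return `data` with all embedded PNG images cut out."""
    out = bytearray()
    pos = 0
    for start, end in find_png_spans(data):
        out += data[pos:start]
        pos = end
    out += data[pos:]
    return bytes(out)
```

What is left after strip_pngs is the part worth searching for the form text, for example the JSON responses that contain "wp_autosave".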
I can view http://hosting2.phor.net/~bolomi/ and save it to a file. In Chrome, I can view the source and see that the title of the page is 波羅蜜. If I open the saved file in vim to edit it, I see question marks in place of those characters. If I then run :set encoding=gb2312, I see the first two characters followed by question marks. If I instead run :e ++enc=gb2312 %:p, I see the middle character.
After the file is open (because I use MacVim and Fetch opens the file for me automatically) what is the correct workflow to edit this file?
gb2312 only does simplified characters.
Firefox, Mozilla, and python2-chardet-2.0.1 all lie and say this is gb2312, but since it has traditional characters, the charset needed is gb18030.
So: if it looks like a duck, quacks like a duck, smells like a duck, and tastes like a duck... open the file with hexdump and read the codepage definition file.
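The difference between the two charsets can be checked without vim. This is a small sketch that tries a list of candidate encodings in strict mode and reports the first one that decodes the bytes without error; the function name and the candidate list are chosen for this example only. The sample bytes are the page title from the question, encoded as gb18030.

```python
def first_strict_decoding(data, candidates=("utf-8", "gb2312", "gb18030")):
    """Return (encoding, text) for the first candidate that decodes `data` cleanly."""
    for name in candidates:
        try:
            return name, data.decode(name)
        except UnicodeDecodeError:
            continue  # this charset cannot represent the bytes; try the next one
    return None, None


# The title from the question; 羅 is a traditional character.
sample = "波羅蜜".encode("gb18030")
encoding, text = first_strict_decoding(sample)
print(encoding, text)
```

Strict decoding as gb2312 fails on the traditional character, so the loop only succeeds once it reaches gb18030. A tool that guesses from byte statistics can still report gb2312, because both encodings share the same two-byte range for the common characters.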
I have vim set to use UTF-8, and as far as I can see it displays the characters correctly. Maybe try
set encoding=utf-8
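If the goal is simply to edit the file in a UTF-8 setup, another option is to convert the saved file to UTF-8 once, before opening it. A minimal sketch, assuming the saved file really is gb18030 as suggested in the other answer; the function name and the file names in the usage line are placeholders.

```python
from pathlib import Path


def transcode_to_utf8(src, dst, source_encoding="gb18030"):
    """Decode `src` with `source_encoding` and write the same text to `dst` as UTF-8."""
    text = Path(src).read_bytes().decode(source_encoding)  # raises if the guess is wrong
    Path(dst).write_bytes(text.encode("utf-8"))
    return text
```

For example, transcode_to_utf8("bolomi.html", "bolomi-utf8.html") would write a UTF-8 copy. Since the file is an HTML page, any charset declared inside it would then have to be changed to UTF-8 by hand as well.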