Text compression
Russian materials |
|||
Authors | Article title | Description | Rating |
Shelwien E. | A dictionary compression puzzle | States the problem of developing an efficient model for a dictionary.
HTML |
|
Kadach A.V. | Compression of texts and hypertexts | Describes a method for compressing natural-language texts that replaces each word with
its index in a dictionary ordered by word frequency. The method allows an arbitrary fragment of the text to be decoded, which is impossible with the well-known compression methods... This article was incorporated in full into
the author's dissertation, so it is listed here simply "for the record".
// Programmirovanie, 1997, No. 4, pp. 47-56. PDF.RAR 827 KB |
|
Smirnov M.A. | Using lossless data compression methods under severe constraints on the resources of the decoder device | A short study of data compression under severe constraints on decoder resources, primarily memory. Compares the efficiency of various methods under the adaptive and static approaches. For the programs compared, it shows how the achieved compression ratio, the decoding speed, and the memory required for decoding relate to one another. The focus is on economical coding of natural-language text.
An edited version of this text was published as: Osipov L.A., Smirnov M.A. Using lossless data compression methods under severe constraints on the resources of the decoder device // Informatsionno-upravlyayushchie sistemy, 2004, No. 4, pp. 7-15. 2004 HTML 220 KB PDF 165 KB |
|
Smirnov M.A. | Methods for increasing the compression ratio of natural-language texts for lossless data compression algorithms | Shows that the compression ratio of natural-language texts can be noticeably increased by taking the grammar of the language into account, without directly building the corresponding probabilistic model. To strengthen text compression, the author proposes a simple preprocessing scheme (based on LIPT) whose distinguishing feature is the insertion of markers (tags) that indicate which part of speech a word belongs to.
2002 HTML 110 KB PDF 102 KB |
|
English materials | |||
Teahan W.J. | Modelling English text | This dissertation studies statistical models of text. Much attention is paid to models of the Prediction by Partial Matching (PPM) class. Several ways of improving the accuracy of text models (and hence compression, when the models are used in compressors) are investigated.
Department of Computer Science, The University of Waikato, Hamilton, New Zealand, May 1998. PDF.RAR 1674 KB PS.RAR 958 KB |
|
Kruse H., Mukherjee A. | Improve Text Compression Ratios with Burrows-Wheeler Transform | Effectiveness of text preprocessing when compressing with BWT compressors: alphabet remapping and dictionary transformation.
Department of Computer Science, The University of Waikato, Hamilton, New Zealand, May 1998. PDF.RAR 113 KB PS.RAR 56 KB |
|
Awan F., Mukherjee A. | LIPT: A Lossless Text Transform to Improve Compression | Text preprocessing with the LIPT dictionary transformation algorithm.
Proceedings of International Conference on Information and Theory: Coding and Computing, IEEE Computer Society, Las Vegas Nevada, April 2001. PDF.RAR 40 KB |
|
Sun W., Mukherjee A., Zhang N. | A Dictionary-Based Multi-Corpora Text Compression System | Describes StarNT, an improved version of LIPT. Simple changes to the LIPT algorithm can often improve compression by a few percent relative to the original.
2003. The work appeared in the DCC'03 conference proceedings as a one-page abstract. PDF.RAR 134 KB |
|
Grabowski, Sz. | Text preprocessing for Burrows-Wheeler block sorting compression | Effectiveness of text preprocessing: capital-letter conversion, separator modification, dictionary transformation. Illustrated with BWT compressors.
VII Konferencja "Sieci i Systemy Informatyczne" (7th Conference "Networks and IT Systems"), Lodz, Oct. 1999, conf. proc., pp. 229-239. PDF.RAR 68 KB RTF.RAR 14 KB |
|
Fenwick P., Brierley S. | Compression of Unicode files | A study of how well texts in various formats, including Unicode, compress with algorithms of different types.
Department of Computer Science, The University of Auckland, 1998. PDF.RAR 59 KB |
|
Moffat A., Sharman N., Zobel J. | Static Compression for Dynamic Texts | Two problems arise when semi-static word-based compression methods are applied to large texts, such as those stored in information retrieval systems. First, the space required for the model during decoding can become very large. Second, the need to handle document insertions means that the collection must be periodically recompressed if compression efficiency is to be maintained. Here we show that with careful management the impact of both of these drawbacks can be minimised...
Proceedings of the 1994 IEEE Data Compression Conference, Snowbird, Utah, March 1994. PDF.RAR 138 KB |
|
Witten I., Bell T., Moffat A., Nevill-Manning C., Smith T., Thimbleby H. | Semantic and Generative Models for Lossy Text Compression | Considers several approaches to lossy text compression. Some are closer to summarization, others to transformations that lose relatively little from a semantic point of view. The techniques may also be useful for lossless compression.
The Computer Journal, Vol.37, No.2, pp.83-87, 1994. PDF 81 KB |
|
Abel J., Teahan W. | Text Preprocessing for Data Compression | A comprehensive description of existing ways of preparing text for subsequent efficient compression.
IEEE, 2003 PDF 92 KB |
|
Horspool R.N., Cormack G.V. | Constructing Word-Based Text Compression Algorithms | Considers four word-based compression algorithms, built on adaptive
Huffman coding, LZW, PPM 1-0, and first-order context modeling that takes the predicted
part of speech into account. Comparative results are given for several small files. A very interesting
article despite its age. It would be interesting to see modern implementations
with more sophisticated modeling schemes that use more memory.
R. Nigel Horspool's publications page. Proceedings of IEEE Data Compression Conference (DCC'92), Snowbird, UT, March 1992, pp. 62-71. PDF.RAR 23 KB |
|
Skibinski P., Grabowski Sz., Deorowicz S. | Revisiting dictionary-based compression | Yet another LIPT variant. Through several modifications and additional preprocessing, the authors substantially outperform StarNT (a LIPT clone).
Describes a large number of clever tricks for text compression. Can serve as a survey article on the topic.
15.01.2005. Accepted for publication in the journal "Software–Practice and Experience". PDF.RAR 224 KB |
|
Mahoney M. | Fast Text Compression with Neural Networks | A tale of how artificial neural networks are sometimes of some use. Achieves compression on par with simple PPM at an acceptable speed.
Proceedings of the Thirteenth International Florida Artificial Intelligence Research Society Conference, 2000, pp.230-324. Home page PDF.RAR 79 KB |
|
Compressor source code |