Text Compression



Russian materials
Authors  Article title  Description  Rating
Shelvin E. A Dictionary Compression Problem. Statement of the problem of developing an efficient model for a dictionary.
HTML
Kadach A.V. Compression of Texts and Hypertexts. Describes a method for compressing natural-language texts that replaces each word with its index in a dictionary ordered by word frequency. The method allows an arbitrary fragment of the text to be decoded, which is impossible with the well-known compression methods... This article was incorporated in full into the author's dissertation, so it is listed here only "for the record".
// Programmirovanie, 1997, No. 4, pp. 47-56.
PDF.RAR  827 KB
5
Smirnov M.A. Using Lossless Data Compression Methods under Tight Constraints on Decoder Device Resources. A small study of data compression under tight constraints on decoder resources, primarily memory. Compares the efficiency of various methods under the adaptive and static approaches. For the programs compared, shows how the achieved compression ratio, the decoding speed, and the memory required for decoding relate to one another. The main focus is on economical coding of natural-language text.
An edited version of this text was published as:
Osipov L.A., Smirnov M.A. Using Lossless Data Compression Methods under Tight Constraints on Decoder Device Resources // Informatsionno-upravlyayushchie sistemy, 2004, No. 4, pp. 7-15.
2004
HTML  220 KB
PDF    165 KB
5
Smirnov M.A. Methods for Increasing the Compression Ratio of Natural-Language Texts for Lossless Data Compression Algorithms. Shows that the compression ratio of natural-language texts can be noticeably increased by taking the grammar of the language into account, without directly building the corresponding probabilistic model. To strengthen the compression of text data, proposes a simple preprocessing scheme (based on LIPT) whose distinctive feature is the insertion of markers (tags) indicating the part of speech a word belongs to.
2002
HTML  110 KB
PDF    102 KB
?


English materials
Teahan W.J. Modelling English text. This dissertation studies statistical models of text. Considerable attention is given to models of the Prediction by Partial Matching (PPM) class. Several ways of improving the accuracy of text models are investigated (and hence of improving compression, when the models are used in compressors).
Department of Computer Science, The University of Waikato, Hamilton, New Zealand, May 1998.
PDF.RAR  1674 KB
PS.RAR    958 KB
5
Kruse H., Mukherjee A. Improve Text Compression Ratios with Burrows-Wheeler Transform. The effectiveness of text preprocessing for compression with BWT compressors: alphabet change and dictionary transformation.
Department of Computer Science, The University of Waikato, Hamilton, New Zealand, May 1998.
PDF.RAR  113 KB
PS.RAR    56 KB
4
Awan F., Mukherjee A. LIPT: A Lossless Text Transform to Improve Compression. Text preprocessing with the LIPT dictionary transformation algorithm.
Proceedings of International Conference on Information and Theory: Coding and Computing, IEEE Computer Society, Las Vegas Nevada, April 2001.
PDF.RAR  40 KB
5
Sun W., Mukherjee A., Zhang N. A Dictionary-Based Multi-Corpora Text Compression System. A description of StarNT, an improved LIPT. Simple changes to the LIPT algorithm can often improve compression by several percent relative to the original.
2003.
The work appeared in the DCC'03 conference proceedings as a one-page abstract.
PDF.RAR  134 KB
4
Grabowski, Sz. Text preprocessing for Burrows-Wheeler block sorting compression. The effectiveness of text preprocessing: capital letter conversion, modification of separators, dictionary transformation. Illustrated with BWT compressors.
VII Konferencja "Sieci i Systemy Informatyczne" (7th Conference "Networks and IT Systems"), Lodz, Oct. 1999, conf. proc., pp. 229-239.
PDF.RAR  68 KB
RTF.RAR  14 KB
5
Fenwick P., Brierley S. Compression of Unicode files. A study of how efficiently texts in various formats, including Unicode, are compressed by algorithms of different types.
Department of Computer Science, The University of Auckland, 1998.
PDF.RAR  59 KB
3
Moffat A., Sharman N., Zobel J. Static Compression for Dynamic Texts. Two problems arise when semi-static word-based compression methods are applied to large texts, such as those stored in information retrieval systems. First, the space required for the model during decoding can become very large. Second, the need to handle document insertions means that the collection must be periodically recompressed if compression efficiency is to be maintained. Here we show that with careful management the impact of both of these drawbacks can be minimised...
Proceedings of the 1994 IEEE Data Compression Conference, Snowbird, Utah, March 1994.
PDF.RAR  138 KB
?
Witten I., Bell T., Moffat A., Nevill-Manning C., Smith T., Thimbleby H. Semantic and Generative Models for Lossy Text Compression. Considers several approaches to lossy text compression. Some are closer to summarization, others to a transformation with fairly small losses in terms of semantics. The techniques may also be useful for lossless compression.
The Computer Journal, Vol.37, No.2, pp.83-87, 1994.
PDF  81 KB
5
Abel J., Teahan W. Text Preprocessing for Data Compression. An exhaustive description of existing methods of preparing text for subsequent efficient compression.
IEEE, 2003
PDF  92 KB
Horspool R.N., Cormack G.V. Constructing Word-Based Text Compression Algorithms. Considers four word-based compression algorithms, built on: adaptive Huffman coding, LZW, PPM 1-0, and first-order context modeling that takes the presumed part of speech into account. Comparative results are given for several small files. A very interesting paper despite its age. It would be interesting to see modern implementations with more complex modeling schemes that use more memory.
R. Nigel Horspool's publications page
Proceedings of IEEE Data Compression Conference (DCC'92), Snowbird, UT, March 1992, pp. 62-71.
PDF.RAR  23 KB
4+
Skibinski P., Grabowski Sz., Deorowicz S. Revisiting dictionary-based compression. Yet another LIPT variant. Through several modifications and additional preprocessing, the authors substantially surpass the results of StarNT (a LIPT clone). Describes a large number of clever tricks for text compression. Can serve as a survey article on the topic.
15.01.2005. Accepted for publication in the journal "Software–Practice and Experience"
PDF.RAR  224 KB
4+
Mahoney M. Fast Text Compression with Neural Networks. A tale of how artificial neural networks are occasionally of some use. Achieves compression on a par with simple PPM at an acceptable speed.
Proceedings of the Thirteenth International Florida Artificial Intelligence Research Society Conference, 2000, pp.230-324.
Home page
PDF.RAR  79 KB
5


Compressor source code
