Full Text Index - Printable Version

+- LetoDMS Community Forum (https://community.letodms.com)
+-- Forum: LetoDMS Support (https://community.letodms.com/forumdisplay.php?fid=4)
+--- Forum: Technical Support (https://community.letodms.com/forumdisplay.php?fid=10)
+--- Thread: Full Text Index (/showthread.php?tid=522)
Full Text Index - Daanl - 07-09-2012

Good morning,

My LetoDMS installation works 100%. I have uploaded PDF files to the server, but when I try to index a file it seems that only the file name is indexed. I loaded an OCR'd PDF file onto the server and I would like to index the entire file and do a full text search.

Can someone help me create a full text index of the PDF file's contents, and not just of the file name?

Regards

RE: Full Text Index - steinm - 07-09-2012

(07-09-2012, 01:46 PM)Daanl Wrote: Good morning,

You need pdftotext for it. Check if it is installed.

Uwe

RE: Full Text Index - Daanl - 07-10-2012

(07-09-2012, 06:48 PM)steinm Wrote: (07-09-2012, 01:46 PM)Daanl Wrote: Good morning,

Hi Uwe,

pdftotext is installed. When I click on Create index it shows:

    Recreating index
    D DMS
    D Testing
    1:511.3 BLOCH. 2000. Proofs and fundamentals.pdf (document added)

and when I click on Fulltext index info it shows:

    8 Terms
    document_id:1
    mimetype:application/x-unknown
    owner:admin
    title:and
    title:bloch
    title:fundamentals
    title:pdf
    title:proofs

Regards,
Daan

RE: Full Text Index - steinm - 07-12-2012

(07-10-2012, 03:29 PM)Daanl Wrote: (07-09-2012, 06:48 PM)steinm Wrote: (07-09-2012, 01:46 PM)Daanl Wrote: Good morning,

The problem is the mimetype of the document. It's application/x-unknown, and documents of that type are not run through any conversion command. It should be application/pdf.

Uwe

RE: Full Text Index - atarifreak - 09-27-2012

(07-12-2012, 07:04 PM)steinm Wrote: The problem is the mimetype of the document. It's application/x-unknown and that is not run through any command. It should be application/pdf.

Well, I have the same problem, but for me the mimetype is pdf...

Code:
document_id:9

And just for the record: using pdftotext on that PDF produced much more text than I expected, so it's not a problem with that text file.

RE: Full Text Index - DerMac - 09-28-2012

Hello,

same problem here.
* LetoDMS works with no errors
* PDF can be imported
* Full-Index created

Code:
Recreating index

Code:
ls -la /volume1/letoDMS/lucene/

Code:
5 Terms

Code:
/tmp $ pdftotext pdf_barrierefrei.pdf text.txt

It seems that pdftotext is not called. Is it possible to debug the process?

Regards

RE: Full Text Index - steinm - 10-01-2012

(09-28-2012, 09:04 PM)DerMac Wrote: It seems that pdftotext is not called.

Just put some echo statements in LetoDMS_Lucene/Lucene/IndexedDocument.php. You should var_dump $_convcmd. It contains the conversion programs.

Uwe

RE: Full Text Index - atarifreak - 10-03-2012

(10-01-2012, 06:03 PM)steinm Wrote: (09-28-2012, 09:04 PM)DerMac Wrote: It seems that pdftotext is not called.

Sorry, I can't do that. Can you please explain exactly what to do?

RE: Full Text Index - DerMac - 10-03-2012

Thank you! With your advice I found the problem.

I installed the DMS on a Synology DiskStation. The PHP config variable 'safe_mode_exec_dir' is set to a special subdirectory. I tried putting a symbolic link to pdftotext in this subdirectory, but got the same result. So I had to unset this variable (via the web front end of the box). Now the full index runs.

Regards.

RE: Full Text Index - atarifreak - 10-03-2012

(10-03-2012, 09:53 PM)DerMac Wrote: Thank you! With your advice I found the problem.

Thank you for that information. Can you tell me how to debug that process? I am not much of a PHP coder, but I know how to use vi :-)

I will install LetoDMS on my Synology too, but first I want to check with LAMPP. So I need to unset safe_mode_exec_dir? Is this done with just "safe_mode_exec_dir =" in php.ini? But does this do anything if safe_mode = off?
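
The two preconditions raised in this thread (pdftotext being available, and the document being recognised as application/pdf) can be checked from a shell with something like the sketch below. It is only a rough aid, not part of LetoDMS: the helper names check_converter and detect_mimetype are made up for illustration, and it assumes the `file` utility is installed, which may not be the case on every NAS. The type printed by `file` is detected from the file contents and may differ from the mimetype LetoDMS stored at upload. It also runs as your shell user, so it cannot reveal restrictions that apply only inside PHP, such as the safe_mode_exec_dir setting DerMac ran into.

```shell
#!/bin/sh
# Sketch only: checks from a shell what the thread checks by hand.
# check_converter and detect_mimetype are hypothetical helper names.

# Succeeds if the named converter (default: pdftotext) is on PATH.
check_converter() {
    command -v "${1:-pdftotext}" >/dev/null 2>&1
}

# Prints the MIME type that the `file` utility detects for a file.
detect_mimetype() {
    file -b --mime-type "$1"
}

pdf="${1:-}"   # optional: path to a PDF to test

if check_converter pdftotext; then
    echo "pdftotext found: $(command -v pdftotext)"
else
    echo "pdftotext not found on PATH"
fi

if [ -n "$pdf" ] && [ -f "$pdf" ]; then
    echo "detected mimetype: $(detect_mimetype "$pdf")"
    if check_converter pdftotext; then
        # "-" sends the extracted text to stdout; count the words.
        echo "words extracted: $(pdftotext "$pdf" - | wc -w)"
    fi
fi
```

Saved under any name (for example check_index.sh, a made-up name) and run as `sh check_index.sh /path/to/document.pdf`, a word count of 0 would suggest the PDF has no text layer, while a large count together with a title-only index points back at the PHP side, where the PHP manual describes safe_mode_exec_dir as taking effect only when safe mode is enabled.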