Posted by: Xepient Solutions on 3/5/2008
The search engine log may show that the page was indexed, but when you search on a keyword, there are no results. How can you verify that the content has really been indexed?
Visual Inspection:
To make sure that the words are contained in the index that the engine produces, you can visually inspect the index files.
To do that, look for the <dnn root>\DesktopModules\XSSearchInput\index directory, where <dnn root> is the folder in which DNN is installed.
In that directory you will see a series of sub-directories, named after the URLs that you are spidering, with names ending in .out.
Those directories are where the indexes are stored. The indexes are a series of text files ... Just view them in Notepad and look for the keywords you expect to find. The same is true for Word (MS Office) and .pdf documents, which are translated to plain text and stored in these indexes.
If the keywords are not there, it is because they were not indexed. Because O-SE uses Lucene as its indexing engine, it has little or no control over which words make it into the index and which do not.
The result may also depend on the HTML parsing routines we use to clean the contents of the spidered pages, but if the page is well formed, all words should pass the filtering.
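If there are many index directories, the Notepad check can be automated. The following is only a minimal sketch, not part of O-SE: the function name `find_keyword` is hypothetical, and it assumes the index files hold the text as UTF-8 bytes, the same way it appears when viewed in Notepad. The match is case-sensitive.

```python
import os

def find_keyword(index_root, keyword, encoding="utf-8"):
    """Return the index files under index_root whose raw bytes contain keyword.

    Assumption: the keyword is stored as plain encoded text inside the files.
    """
    needle = keyword.encode(encoding)
    hits = []
    for dirpath, _dirs, filenames in os.walk(index_root):
        # Only look inside the per-URL index directories (names ending in ".out").
        rel_parts = os.path.relpath(dirpath, index_root).split(os.sep)
        if not any(part.endswith(".out") for part in rel_parts):
            continue
        for name in filenames:
            path = os.path.join(dirpath, name)
            with open(path, "rb") as fh:
                if needle in fh.read():
                    hits.append(path)
    return sorted(hits)
```

Call it with the path of the index directory and the keyword you searched for; an empty list means the keyword was not found in any index file.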
Comments (11)
Re: O-SE: how do I know if content has been indexed?
By Vedran on 9/12/2008
Hello!
Great module! We are thinking of buying it. I would like to know whether Open-SearchEngine can also index the description and keywords defined in the page settings (i.e. the meta tags in the head section of the HTML)?
|
|
Re: O-SE: how do I know if content has been indexed?
By host on 9/12/2008
Thank you Vedran,
The module indexes the following meta tags: title and description. All others are not included.
We could easily add more meta tags and include them in the next minor release... I'll place it on the TODO list.
|
|
Re: O-SE: how do I know if content has been indexed?
By Kiran Sonawane on 9/26/2008
I am using the trial version of Open-SearchEngine.
The search engine log may show that the page was indexed, but when you do a search on a keyword, there are no results. I checked the DesktopModules directory, and the PDF is indexed: "localhost.camba.Portals.0.Program.O-SE.v2.0.OwnerManual.pdf.out"
But searching returns no results. I would appreciate it if someone could help.
|
|
Re: O-SE: how do I know if content has been indexed?
By benchmark on 9/26/2008
The search engine log may show that the page was indexed, but when you do a search on a keyword, there are no results.
Please help. I am using the trial version. I have done all the settings in admin.
I also added the 3 DLLs that are needed for indexing.
|
|
Re: O-SE: how do I know if content has been indexed?
By host on 9/26/2008
Hello Kiran and Benchmark,
It seems that you are having the same issue at the same time.
Since you have both followed the instructions and have verified that the index has been created, the next step is to open and visually inspect the files in the index (they can be opened with Notepad regardless of the extension) and do a Notepad "find" of the keyword you were looking for through the search engine. For the search to find the text, it must be within one of the index files; otherwise it will not be found.
If the word is not there, then you should verify that other text from your document is contained in the index... If some text is there and other text has been excluded, the issue could depend on the formatting of the HTML or PDF (but it is highly unlikely that this will happen).
Also, words like articles and prepositions are sometimes considered noise and will not be searched on.
Regards,
Xepient Solutions
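To illustrate the point about noise words, here is a rough sketch of how a query could be split into searchable words and noise words. The word list below is a small illustrative sample of English articles and prepositions; it is an assumption and not the list O-SE or Lucene actually uses, and `split_query` is a hypothetical helper.

```python
# Assumption: an illustrative sample only, not the engine's real noise-word list.
NOISE_WORDS = {"a", "an", "the", "of", "in", "on", "at", "to", "for", "with", "by"}

def split_query(query):
    """Split a query into (searchable, noise) word lists, ignoring case."""
    words = query.lower().split()
    searchable = [w for w in words if w not in NOISE_WORDS]
    noise = [w for w in words if w in NOISE_WORDS]
    return searchable, noise
```

If every word of a query ends up in the noise list, the search has nothing left to match, even though the page itself was indexed.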
|
|
Re: O-SE: how do I know if content has been indexed?
By benchmark on 9/27/2008
Thanks for the quick reply.
I followed your instructions. I opened the file (named _3.fdt) in Notepad and searched for the word "ISearcheable". It was found twice. When I search for the same word on my website, unfortunately, it gives no results. I had planned to purchase O-SE, but so far my trial version is not working in my DNN application.
Please let me know your thoughts on this.
Thanks
|
|
Re: O-SE: how do I know if content has been indexed?
By benchmark on 9/27/2008
I placed the PDF file in the Portals\0\Program directory, and this directory is set for indexing in the admin panel.
|
|
Re: O-SE: how do I know if content has been indexed?
By host on 9/27/2008
Please try the following:
1. In the search results module settings, make sure that the directory you indexed is selected as one of the indexes ... the parameter is called "Search Scope".
2. Type the search word with the same casing (upper/lower).
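To see which casing of a word is actually stored, the bytes of an index file can be checked for each variant. This is only a sketch under the assumption that the file stores the word as UTF-8 text; `casing_variants` is a hypothetical helper, not part of the module.

```python
def casing_variants(data, word, encoding="utf-8"):
    """Count how often word occurs in data (bytes) as typed, lowercased and uppercased."""
    variants = {"as typed": word, "lower": word.lower(), "upper": word.upper()}
    return {label: data.count(v.encode(encoding)) for label, v in variants.items()}
```

Read an index file with `open(path, "rb").read()` and pass the result as `data`; the variant with a non-zero count is the casing to try in the search box.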
|
|
Re: O-SE: how do I know if content has been indexed?
By host on 9/27/2008
Please send the .pdf file to info[@]xepient.com. We will test it and get back to you as quickly as possible.
|
|
How to refresh the search result?
By selamat on 5/22/2009
We are using your search engine module v1.0.
The problem is the search results. I have deleted the file (x.pdf), but the results still show x.pdf.
Is there any option to refresh the results?
Thank you
|
|
Re: O-SE: how do I know if content has been indexed?
By host on 5/22/2009
Hi selamat,
As you experienced, deleting the physical file does not remove its content from the search results. This is because all search results are taken from the catalog that the engine creates.
Thus, to remove content, the catalog has to change. This change can be achieved by re-indexing the website (running the scheduled task again).
The scheduled task will run with the frequency you have established, or it can be forced to run on demand. To force a run, you can follow the steps outlined in the owner's manual (basically, edit the scheduled task and click update twice).
In short, you will need to re-run the scheduled task.
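One way to confirm that a re-index actually ran is to look at when the index files were last written. The sketch below assumes that re-indexing rewrites the files inside the .out directories, which this post does not confirm; `index_last_modified` is a hypothetical helper.

```python
import os
from datetime import datetime

def index_last_modified(index_root):
    """Map each '*.out' directory directly under index_root to its newest file time."""
    result = {}
    for entry in os.scandir(index_root):
        if entry.is_dir() and entry.name.endswith(".out"):
            mtimes = [
                os.path.getmtime(os.path.join(entry.path, name))
                for name in os.listdir(entry.path)
            ]
            if mtimes:  # skip empty index directories
                result[entry.name] = datetime.fromtimestamp(max(mtimes))
    return result
```

If the reported time is older than the moment you deleted the file, the scheduled task has probably not run since, and the old content is still in the catalog.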