This command is available from the Authoring menu in the FAR main window.
The command is disabled until you add HTML files [.htm|.html|.asp|.hta|.mht] to the FAR file list. Add only the HTML files you want to rip into the JS search data. Click Build to collect all search terms from all HTML files and generate a searchdata.js file, then go to the Test page and generate a test search system. You can use the result on your own help web, but the output file searchdata.js is primarily consumed by the FAR uncompressed help dialog (see Authoring > Make Uncompressed Help dialog), which generates TOC, Index and Search navigation for your site.
Past versions of FAR were ANSI based, which meant you were limited to working with a single foreign language (which had to match the operating system language) plus English (which is part of every ANSI code page). This was very limiting and caused a lot of confusion.
FAR v5 is now Unicode which means you can now work with all languages simultaneously. The search database can contain search terms from German, French, Japanese, Chinese etc all together.
If you have foreign language source HTML files that do not match the operating system language, make sure they are in Unicode (UTF-8 or UTF-16) format so FAR can index them correctly.
- You may ship or publish the FAR Search solution only if you have purchased a copy of FAR HTML.
- We have not attempted to protect or obfuscate the source files. This is so you can debug or modify the code if you need to.
- If you modify the code you must still give full credit to us and to FAR. Please do not remove any copyright notices.
Please respect our intellectual property by asking permission before redistributing or sharing source with other web developers and authors. As far as your web site is concerned you don't need to ask permission to ship our files as long as you are a licensed user of FAR HTML.
Base Dir: This read-only field is set when you add files to the FAR file list (FAR main window). Normally it is the root folder of your web site. All search data files will be created or copied to this location.
File name: This is the name of the search data file that will be created when the Build button is clicked. Normally it should be left as "searchdata.js". If you change the name, be sure to either rename the generated file back to searchdata.js later OR modify the search HTML file that uses this file.
You can also enter the name of a .HTM file. FAR will create the HTML file, collect all search words found in the FAR file list, and add them to a hidden <DIV> section in the HTML file. For more info see Special Use.
These options affect which search terms FAR will store in the searchdata.js file.
Ignore Words x chars long and smaller: Check this option to exclude words up to X characters in length.
Ignore words containing only numbers: Check this option to exclude words containing only numbers, e.g. 1234 or 911.
Use Stop Word List file: Enter the full path of a file containing a list of words to exclude. This is a simple text file containing one word per line (same type of file authors use in MS compiled HTML Help). If you use non-English characters, it is best to make sure the file format is Unicode (UTF-8 is preferred).
Additional search chars: By default FAR gathers all words containing alpha-numeric characters, and "-" and "_". Normally, if a character such as "\" is found in a word, FAR treats it like a space and breaks the word into two words. If you want to allow words containing, say, "\" and "/" chars then simply enter \ and / into the entry field.
Alternatively you can place all your extra chars into a text file. Enter the full path of the text file into the entry field. Spaces and Control characters (such as line feeds and tabs chars) will be ignored.
FAR v5 Note: In the past (FAR v4 ANSI) to do Japanese or Chinese you had to include every foreign language character in this list. Under FAR v5 Unicode this is no longer required.
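The options above can be pictured with a small sketch. This is not FAR's code: the function name collectTerms, the option names, and the decision to lower-case every term are assumptions made only for illustration.

```javascript
// Hypothetical sketch of the word-filter options; not FAR's actual code.
function collectTerms(text, options) {
  const {
    minLength = 0,         // "Ignore Words x chars long and smaller"
    ignoreNumbers = false, // "Ignore words containing only numbers"
    stopWords = [],        // words read from the stop word list file
    extraChars = "",       // "Additional search chars"
  } = options || {};

  // Drop spaces and control chars, then escape chars that are special
  // inside a regular expression character class.
  const escaped = extraChars
    .replace(/\s/g, "")
    .replace(/[\\\]\[^\-]/g, "\\$&");

  // Letters and digits of any language, plus "-" and "_", plus the extras.
  const wordPattern = new RegExp("[\\p{L}\\p{N}_\\-" + escaped + "]+", "gu");

  const stop = new Set(stopWords.map((w) => w.toLowerCase()));
  const terms = new Set();
  for (const word of text.match(wordPattern) || []) {
    const lower = word.toLowerCase(); // assumption: terms are case-insensitive
    if (lower.length <= minLength) continue;
    if (ignoreNumbers && /^\p{N}+$/u.test(lower)) continue;
    if (stop.has(lower)) continue;
    terms.add(lower);
  }
  return [...terms];
}
```

With no extra chars, a path such as C:\docs\readme is split into "c", "docs" and "readme". Passing "\\" as extraChars keeps "\docs\readme" together as one term.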
Select a Code Page to use when reading ANSI files: If you are reading foreign language ANSI files then select the Code Page here so that FAR can correctly read them. Older versions of FAR required you to set the code page in the Windows Control Panel. If you have a mixture of foreign languages then you will need to convert your source files to Unicode (UTF-8 is probably best because of its smaller size).
Break apart CJK text: Chinese/Japanese/Korean character text is often _not_ broken into words by space characters and punctuation. This is not a problem for search, since checking the "Partial match" checkbox on the search form will always find the text. If you check this option, all CJK paragraphs are broken into as many substrings as possible and added to the database. "Partial match" is then no longer required, but the search data file becomes much larger, which could cause the search page to load very slowly on the web. We recommend you leave this unchecked and train your users to use "Partial match".
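To see why this option bloats the data file, here is a rough illustration. It assumes that "as many substrings as possible" means every contiguous substring of a CJK run; FAR's real splitting rules may differ, and the function names are hypothetical.

```javascript
// Hypothetical illustration only; FAR's real splitting rules may differ.
function allSubstrings(run) {
  const chars = Array.from(run); // split by code point, not UTF-16 unit
  const out = new Set();
  for (let start = 0; start < chars.length; start++) {
    for (let end = start + 1; end <= chars.length; end++) {
      out.add(chars.slice(start, end).join(""));
    }
  }
  return [...out];
}

// Without splitting, a partial match is just a substring test on the stored run.
function partialMatch(term, query) {
  return term.includes(query);
}

console.log(allSubstrings("検索機能").length);  // 10 entries for a 4-character run
console.log(partialMatch("検索機能", "機能"));  // true
```

A run of n distinct characters produces n(n+1)/2 entries under this assumption, so long unbroken paragraphs grow the data file quickly, while a partial match finds the same text from a single stored entry.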
This button kicks off the generation of a new searchdata.js file into the Base Directory. If you have a lot of files to process it can take a long time to scan them all for search terms. In this case go and get a cup of coffee. :-)
Note that FAR only knows how to parse HTML files for words. If you include any other file types in the file list FAR will ignore them. FAR considers the following file types to be HTML: [.htm|.html|.asp|.hta|.mht]. This list can be modified by editing the Settings.ini file (FAR.EXE folder).
HtmlHelpFileTypes = .htm|.html|.asp|.hta|.mht
However remember that other functions in FAR also use this rule. Also note that the HTML parsing function is expecting to find at least one <body> tag.
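As a rough picture of how such a rule can be applied, the sketch below checks a file name against the pipe-separated list and looks for a <body> tag. The helper names and the case-insensitive comparison are assumptions, not FAR's implementation.

```javascript
// Value copied from the Settings.ini line above.
const htmlHelpFileTypes = ".htm|.html|.asp|.hta|.mht";

// Hypothetical helper: true if the file name ends in one of the listed extensions.
function isHtmlFile(fileName, typeList = htmlHelpFileTypes) {
  const extensions = typeList.toLowerCase().split("|").map((e) => e.trim());
  const dot = fileName.lastIndexOf(".");
  if (dot < 0) return false; // no extension at all
  return extensions.includes(fileName.slice(dot).toLowerCase());
}

// Hypothetical check for the <body> tag the parser expects to find.
function hasBodyTag(html) {
  return /<body[\s>]/i.test(html);
}
```

For example, isHtmlFile("manual.pdf") is false with the default list, and a page fragment without a <body> tag fails hasBodyTag even if its extension is accepted.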
Above the Build button are some progress fields: the name of the file currently being scanned, the current file number / total files to be scanned, and the words found in the current file / total words found so far.
searchdata.js: This is the data file of search terms you generated on the build page.
search.js: Contains functions used to search the data file, list and display results. Do not edit this file unless you have a very good reason to do so.
search0?.html, search0?.frame.html: Search form examples. Open the frame file to test your search. You can modify these files to create your own search forms.
Apart from searchdata.js, which is created from scratch, all these files live in the .\extra\ folder below the FAR.EXE folder.
Drop Down Control - This control contains a few simple search form examples. Select a search form then click the Create button.
Create button - Press the Create button to copy all search files into the Base Directory (searchdata.js is already there). Create also opens the frame file so you can test your search engine.
Windows Explorer - This button opens Windows Explorer at the current Base Directory so you can examine the files.
The procedure is quite simple.
- Add to the main FAR file list all the files you want to include into searchdata.js.
I normally set my Drop Filter to ".HTM* file only" then drag and drop the root folder of my local web onto FAR.
- Open this window and click "Build searchdata.js File". This collects all the search terms in all the HTML files and generates a data file called searchdata.js (or whatever Output file name you specified) in the base directory (which is set when you add files to the FAR file list).
- Go to the Test page, select a search form example and click "Create". This copies the selected example files into your Base Dir and opens the search in your browser so you can test it.
It depends on how big your searchdata.js file becomes. Webs containing thousands of files can end up with searchdata.js files several MBytes in size. For a web site this is a very big download. For a local file on the hard disk of a fast PC this is not a problem. On slower computers the user may experience a slight delay when the search form loads. In general a few thousand average sized topic files is not a problem. Experiment and test for yourself. Once the data loads the search operation itself is relatively fast. If the data file becomes too big then try breaking the web site into several logical areas and generate a search data file for each area.
You could create HTML files containing nothing but search keywords that, when opened, divert the user to a non-HTML file such as a PDF or JPG file. In this way you could indirectly allow non-HTML files to be present in the search results generated by any search engine. See Special Use below.
This is a multi-line search form. See our web site for a live example of this form. It works best in the left frame of a web site. The result list is configured with Target="right" so that when you click a result item the topic page opens in the right side pane (pane with name="right"). If your right pane is called say "contents" then simply change the last parameter of the DoSearch() function from 'right' to 'contents'. You can find this by searching the code in the search form Search0.HTML.
The Form section in the Body of the document is slightly more complex.
Function SearchForWords() is called whenever the search button is clicked. It calls the main function in search.js, DoSearch(), passing it the search query terms entered. DoSearch() in turn populates the listbox "SearchResultList" with results. If there is no list box called "SearchResultList" defined then the search results are simply written to a new HTML document.
DoSearch(s1, s2, s3, PartialMatch, Target);
The nice thing about the Search00 example is that users don't need to use the key words OR and AND and NOT. They just need to enter search terms into the appropriate entry fields. You could however alter the form and use only one entry field. If you wanted the default action to be "words OR'd" then you would pass all search words entered in the first parameter S1. Parameters S2 and S3 would be left blank. If you want the default action to be "Words AND'd" then pass all search words entered in the second parameter S2 (S1 & S3 left blank). Other examples show you how to do this.
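A single-field form could be wired up roughly as follows. The DoSearch() defined here is only a stand-in so the sketch runs on its own; on a real page the function comes from search.js. The handler names are hypothetical, and passing PartialMatch as a boolean is an assumption.

```javascript
// Stand-in for the real DoSearch() in search.js so this sketch is self-contained.
// It only records its arguments; the real function performs the search.
const calls = [];
function DoSearch(s1, s2, s3, partialMatch, target) {
  calls.push({ s1, s2, s3, partialMatch, target });
}

// Hypothetical handlers for a form with one entry field.
function searchAnyWord(query, partialMatch, target) {
  DoSearch(query, "", "", partialMatch, target); // everything in S1: words OR'd
}

function searchAllWords(query, partialMatch, target) {
  DoSearch("", query, "", partialMatch, target); // everything in S2: words AND'd
}

searchAnyWord("dogs cats", false, "right");     // results open in the pane named "right"
searchAllWords("dogs cats", false, "contents"); // results open in the pane named "contents"
```

The last argument is the name of the frame that receives the topic, so renaming your right pane only requires changing that one string.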
With the first 2 fields the user can control whether a search term is OR'd, AND'd or NOT'd by prefixing each search term with the keyword OR, AND or NOT, just as you can in MS Help or Google.
Example: Here's what happens for a search spec of "Dogs or Cats and Birds not Fish".
- Results list = All topics found containing the word "Dogs"
- Results list = Add topics containing the word "Cats"
- Results list = Reduce the results to topics containing the word "Birds".
- Results list = Reduce the results to topics not containing the word "Fish".
To put it another way: Topics containing ("Dogs" OR "Cats") AND "Birds" but NOT "Fish".
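The left-to-right evaluation described above can be sketched as below. The index contents and the function name are made up for illustration, and the rule that a keyword applies only to the term that follows it is an assumption; search.js may handle this differently.

```javascript
// Hypothetical index: search term -> topic files containing it.
const index = new Map([
  ["dogs", new Set(["a.htm", "b.htm"])],
  ["cats", new Set(["c.htm"])],
  ["birds", new Set(["a.htm", "c.htm", "d.htm"])],
  ["fish", new Set(["c.htm"])],
]);

function evaluateQuery(query, idx, defaultOp = "or") {
  let results = new Set();
  let op = defaultOp;
  for (const token of query.toLowerCase().split(/\s+/).filter(Boolean)) {
    if (token === "or" || token === "and" || token === "not") {
      op = token; // remember the keyword for the next term
      continue;
    }
    const hits = idx.get(token) || new Set();
    if (op === "or") {
      for (const topic of hits) results.add(topic);                  // add topics
    } else if (op === "and") {
      results = new Set([...results].filter((t) => hits.has(t)));    // keep matches
    } else {
      results = new Set([...results].filter((t) => !hits.has(t)));   // drop matches
    }
    op = defaultOp; // assumption: a keyword applies to one term only
  }
  return [...results].sort();
}

console.log(evaluateQuery("Dogs or Cats and Birds not Fish", index)); // [ 'a.htm' ]
```

With this sample index the list grows to a.htm, b.htm, c.htm after "Cats", shrinks to a.htm, c.htm after "Birds", and ends as a.htm once "Fish" is excluded.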
Highlighting search terms in the result topic
When you open a result topic FAR automatically highlights the search terms found in that file. This feature works in Internet Explorer only.
If you examine the code in search.js you will see that if the search form has a result list named "SearchResultList" then it is populated with search results. If no "SearchResultList" control is found then a new result list document is dynamically created and results are displayed in it as a simple list of links. See Examples 02 and 03 to see this in action. These are not polished examples. To modify the search results window you need to tweak the search.js function called ShowSearchResultsWindow().
Here's a novel use for this dialog that may help some authors. The problem with non-HTML files such as .PDF is that you can't search within them using the main search engine. Here's a way to get around this problem.
You could create a special HTML page that links to (or auto-loads) a .PDF file. If all the unique words from the PDF are added to a hidden section in the HTML file (and the search data is rebuilt), then the user can search for a PDF keyword and find the HTML file.
If you use a .HTML file extension for the output file name, then this dialog will produce an HTML file with embedded keywords. This can also help when debugging (ie. since it lists all search terms found in a readable fashion).
Build will now create an HTML file with a hidden list of keywords.
Edit the HTML file and replace "XXX.PDF" with the real target filename.
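If you prefer to write such a page by hand, the idea looks roughly like this. The layout below is an assumption for illustration only; the file FAR generates may be structured differently, and the function names are hypothetical.

```javascript
// Hypothetical page layout; the file FAR actually generates may differ.
function escapeHtml(text) {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

function buildKeywordPage(title, targetFile, keywords) {
  const target = escapeHtml(targetFile);
  return [
    "<html>",
    "<head>",
    "<title>" + escapeHtml(title) + "</title>",
    // Send the reader on to the non-HTML file as soon as the page opens.
    '<meta http-equiv="refresh" content="0; url=' + target + '">',
    "</head>",
    "<body>",
    '<p><a href="' + target + '">' + escapeHtml(title) + "</a></p>",
    // Keywords are in the page for the indexer but hidden from readers.
    '<div style="display:none">' + keywords.map(escapeHtml).join(" ") + "</div>",
    "</body>",
    "</html>",
  ].join("\n");
}

console.log(buildKeywordPage("Pump Manual", "manual.pdf", ["pump", "valve"]));
```

The page keeps a <body> tag because the HTML parser expects one, and the visible link gives readers a fallback if the automatic redirect is blocked.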