Release-1.8.3

author: Dimitri van Heesch <dimitri@stack.nl> 2012-12-26 15:59:17 (GMT)
committer: Dimitri van Heesch <dimitri@stack.nl> 2012-12-26 15:59:17 (GMT)
commit: 48f4de5c47d55b6622b6fdc9b5c288e19d5692f9 (patch)
tree: 629c4681a5158d26512b815623754b33165d8d23 /doc/extsearch.doc
parent: fee4053bd3dd075a2dd2cba4da8166ec5307eadd (diff)
download: Doxygen-48f4de5c47d55b6622b6fdc9b5c288e19d5692f9.zip
Doxygen-48f4de5c47d55b6622b6fdc9b5c288e19d5692f9.tar.gz
Doxygen-48f4de5c47d55b6622b6fdc9b5c288e19d5692f9.tar.bz2
1 files changed, 320 insertions, 0 deletions
diff --git a/doc/extsearch.doc b/doc/extsearch.doc
new file mode 100644
index 0000000..a86d1db
--- /dev/null
+++ b/doc/extsearch.doc
@@ -0,0 +1,320 @@
+/******************************************************************************
+ *
+ * Copyright (C) 1997-2012 by Dimitri van Heesch.
+ *
+ * Permission to use, copy, modify, and distribute this software and its
+ * documentation under the terms of the GNU General Public License is hereby 
+ * granted. No representations are made about the suitability of this software 
+ * for any purpose. It is provided "as is" without express or implied warranty.
+ * See the GNU General Public License for more details.
+ *
+ * Documents produced by Doxygen are derivative works derived from the
+ * input used in their production; they are not affected by this license.
+ *
+ */
+/*! \page extsearch External Indexing and Searching
+
+\section extsearch_intro Introduction
+
+With release 1.8.3, doxygen provides the ability to search through HTML using
+an external indexing tool and search engine.
+This has several advantages:
+- For large projects it can have significant performance advantages over
+  doxygen's built-in search engine, as doxygen uses a rather simple indexing
+  algorithm.
+- It allows combining the search data of multiple projects into one index,
+  allowing a global search across multiple doxygen projects.
+- It allows adding other data to the search index, e.g. web pages
+  not produced by doxygen.
+- The search engine needs to run on a web server, but clients can still browse
+  the web pages locally.
+
+To spare everyone from having to write their own indexer and search 
+engine, doxygen provides an example tool for each action: `doxyindexer` 
+for indexing the data and `doxysearch.cgi` for searching through the index.
+
+The data flow is shown in the following diagram:
+\dot
+digraph Flow {
+  edge [fontname="helvetica",fontsize="10pt"];
+  node [shape=ellipse,fontname="helvetica",fontsize="10pt"];
+  doxygen;
+  doxyindexer;
+  doxysearch [label="doxysearch.cgi"];
+  browser [label="HTML page\nin browser"];
+  node [shape=note];
+  searchdata [label="searchdata.xml"];
+  searchindex [label="doxysearch.db"];
+
+  doxygen -> searchdata [label=" writes"];
+  searchdata -> doxyindexer [label=" reads"];
+  doxyindexer -> searchindex [label=" writes"];
+  searchindex -> doxysearch [label=" reads"];
+  doxysearch -> browser [label=" get results "];
+  browser -> doxysearch [label=" query "];
+}
+\enddot
+
+- `doxygen` produces the raw search data
+- `doxyindexer` indexes the data into a search database `doxysearch.db`
+- when a user performs a search from a doxygen generated HTML page, 
+  the CGI binary `doxysearch.cgi` will be invoked.
+- the `doxysearch.cgi` tool will perform a query on the database and return
+  the results. 
+- The browser will show the search results.
+
+\section extsearch_config Configuring
+
+The first step is to make the search engine available via a web server.
+If you use `doxysearch.cgi` this means making the
+<a href="http://en.wikipedia.org/wiki/Common_Gateway_Interface">CGI</a> binary
+available from the web server (i.e. being able to run it from a 
+browser via a URL starting with http:).
+
+How to set up a web server is outside the scope of this document,
+but if you have Apache installed, for instance, you could simply copy the 
+`doxysearch.cgi` file from doxygen's `bin` directory to the `cgi-bin` directory of the
+Apache web server. Read the <a href="http://httpd.apache.org/docs/2.2/howto/cgi.html">Apache documentation</a> for details.
+
+To test if `doxysearch.cgi` is accessible, start your web browser and
+point it to the URL of the binary with `?test` appended:
+
+    http://yoursite.com/path/to/cgi/doxysearch.cgi?test
+
+You should get the following message:
+
+    Test failed: cannot find search index doxysearch.db
+
+If you use Internet Explorer you may be prompted to download a file,
+which will then contain this message. 
+
+Since no `doxysearch.db` has been created or installed yet, it is OK for the test to
+fail for this reason. How to correct this is discussed in the next section.
+
+Before continuing with the next section add the above 
+URL (without the `?test` part) to the `SEARCHENGINE_URL` tag in
+doxygen's configuration file:
+
+    SEARCHENGINE_URL = http://yoursite.com/path/to/cgi/doxysearch.cgi
+
+\subsection extsearch_single Single project index
+
+To use the external search option, make sure the following options are enabled
+in doxygen's configuration file:
+
+    SEARCHENGINE           = YES
+    SERVER_BASED_SEARCH    = YES
+    EXTERNAL_SEARCH        = YES
+
+This will make doxygen generate a file called `searchdata.xml` in the output 
+directory (configured with \ref cfg_output_directory "OUTPUT_DIRECTORY").
+You can change the file name (and location) with the 
+\ref cfg_searchdata_file "SEARCHDATA_FILE" option.
+
+The next step is to put the raw search data into an index for efficient 
+searching. You can use `doxyindexer` for this. Simply run it from the command 
+line:
+
+    doxyindexer searchdata.xml
+
+This will create a directory called `doxysearch.db` with some files in it.
+By default the directory will be created at the location from which doxyindexer
+was started, but you can change the directory using the `-o` option.
+
+Copy the `doxysearch.db` directory to the same directory as where 
+the `doxysearch.cgi` is located and rerun the browser test by pointing 
+the browser to
+
+    http://yoursite.com/path/to/cgi/doxysearch.cgi?test
+
+You should now get the following message:
+
+    Test successful.
+
+Now you should be able to search for words and symbols from the HTML output.
+
+\subsection extsearch_multi Multi project index
+
+In case you have two doxygen projects A and B where B depends on A via a 
+tag file, i.e. the configuration of project A says:
+
+    GENERATE_TAGFILE = A.tag
+
+and the configuration of project B has its dependency on A configured as 
+follows:
+
+    TAGFILES = ../project_A/A.tag=../../project_A/html
+
+then it may be desirable to allow searching for words in both projects.
+
+To make this possible all that is needed is to combine the search data
+for both projects into one index, i.e. run
+
+    doxyindexer project_A/searchdata.xml project_B/searchdata.xml
+
+and then copy the resulting `doxysearch.db` to the directory that contains the
+`doxysearch.cgi` used by project B.
+
+In case you also want to link to search results in project B 
+from the search page of project A (or in general 
+between two projects that are otherwise unrelated),
+you need to give some additional information in order for doxygen to make 
+the right links. This is what the 
+\ref cfg_extra_search_mappings "EXTRA_SEARCH_MAPPINGS" option is for.
+
+Each project needs to have a tag file defined, i.e. in the above example 
+involving project A and B, also project B should define a tag file:
+
+    GENERATE_TAGFILE = B.tag
+
+then project A can define the mapping as follows:
+
+    EXTRA_SEARCH_MAPPINGS = B.tag=../../project_B/html
+
+with this addition, projects A and B can share the same search database.
+
+@note The mapping defined by `EXTRA_SEARCH_MAPPINGS` is treated as an 
+extension of the mappings already defined by `TAGFILES`. In case the same 
+tag file is mentioned in both options, the one in `TAGFILES` is used.
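+
+Putting the pieces of this subsection together, the relevant parts of the two
+configuration files could look as follows. This is only a sketch: it assumes
+that `project_A` and `project_B` are sibling directories, as the relative paths
+used above suggest, and that both projects point to the same (placeholder) 
+search engine URL.
+
+Project A:
+
+    GENERATE_TAGFILE       = A.tag
+    SEARCHENGINE           = YES
+    SERVER_BASED_SEARCH    = YES
+    EXTERNAL_SEARCH        = YES
+    SEARCHENGINE_URL       = http://yoursite.com/path/to/cgi/doxysearch.cgi
+    EXTRA_SEARCH_MAPPINGS  = B.tag=../../project_B/html
+
+Project B:
+
+    GENERATE_TAGFILE       = B.tag
+    TAGFILES               = ../project_A/A.tag=../../project_A/html
+    SEARCHENGINE           = YES
+    SERVER_BASED_SEARCH    = YES
+    EXTERNAL_SEARCH        = YES
+    SEARCHENGINE_URL       = http://yoursite.com/path/to/cgi/doxysearch.cgi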
+
+\section extsearch_update Updating the index
+
+When you modify the source code, you should re-run doxygen to get up-to-date
+documentation again. When using external searching you also need to update the
+search index by re-running `doxyindexer`. You could wrap the calls to doxygen
+and doxyindexer together in a script to make this process easier.
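+
+For a single project, such a script could consist of little more than the
+following two commands. This is a sketch under a few assumptions: the 
+configuration file is called `Doxyfile`, `searchdata.xml` is written to the 
+directory from which doxygen is run, and `/path/to/cgi` stands for the 
+directory on the web server that holds `doxysearch.cgi`.
+
+    doxygen Doxyfile
+    doxyindexer -o /path/to/cgi searchdata.xml
+
+The `-o` option makes `doxyindexer` create `doxysearch.db` directly in the
+given directory, so no separate copy step is needed.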
+
+\section extsearch_api Programming interface
+
+Previous sections have assumed you use the tools `doxyindexer` 
+and `doxysearch.cgi` to do the indexing and searching, but you could also 
+write your own index and search tools if you like.
+
+For this, three interfaces are important:
+- The format of the input for the index tool.
+- The format of the input for the search engine.
+- The format of the output of the search engine.
+
+The next subsections describe these interfaces in more detail.
+
+\subsection extsearch_api_index Indexer input format
+
+The search data produced by doxygen follows the 
+<a href="http://wiki.apache.org/solr/UpdateXmlMessages">Solr XML index message</a>
+format.
+
+The input for the indexer is an XML file, which consists of one `<add>` tag containing 
+multiple `<doc>` tags, which in turn contain multiple `<field>` tags. 
+
+Here is an example of one doc node, which contains the search data and meta data for 
+one method:
+
+    <add>
+      ...
+      <doc>
+        <field name="type">function</field>
+        <field name="name">QXmlReader::setDTDHandler</field>
+        <field name="args">(QXmlDTDHandler *handler)=0</field>
+        <field name="tag">qtools.tag</field>
+        <field name="url">de/df6/class_q_xml_reader.html#a0b24b1fe26a4c32a8032d68ee14d5dba</field>
+        <field name="keywords">setDTDHandler QXmlReader::setDTDHandler QXmlReader</field>
+        <field name="text">Sets the DTD handler to handler DTDHandler()</field>
+      </doc>
+      ...
+    </add>
+
+Each field has a name. The following field names are supported:
+- *type*: the type of the search entry; can be one of: source, function, slot, 
+          signal, variable, typedef, enum, enumvalue, property, event, related, 
+          friend, define, file, namespace, group, package, page, dir
+- *name*: the name of the search entry; for a method this is the qualified name of the method,
+          for a class it is the name of the class, etc.
+- *args*: the parameter list (in case of functions or methods)
+- *tag*:  the name of the tag file used for this project.
+- *url*:  the (relative) URL to the HTML documentation for this entry.
+- *keywords*: important words that are representative for the entry. When searching for such a
+          keyword, this entry should get a higher rank in the search results.
+- *text*: the documentation associated with the item. Note that only words are present, no markup.
+
+@note Due to the potentially large size of the XML file, it is recommended to use a 
+<a href="http://en.wikipedia.org/wiki/Simple_API_for_XML">SAX based parser</a> to process it.
+
+\subsection extsearch_api_search_in Search URL format
+
+When the search engine is invoked from a doxygen generated HTML page, a number of parameters are
+passed to it via the <a href="http://en.wikipedia.org/wiki/Query_string">query string</a>.
+
+The following fields are passed:
+- *q*:  the query text as entered by the user
+- *n*:  the number of search results requested.
+- *p*:  the number of the search page for which to return the results, counting from 0. Each page has *n* values.
+- *cb*: the name of the callback function, used for JSON with padding, see the next section.
+       
+From the complete list of search results, the results with (zero-based) indices `n*p` through `n*(p+1)-1` should be returned.
+
+Here is an example of what a query looks like:
+
+    http://yoursite.com/path/to/cgi/doxysearch.cgi?q=list&n=20&p=1&cb=dummy
+
+It represents a query for the word 'list' (`q=list`) requesting 20 search results (`n=20`), 
+starting with result number 20 (`p=1`) and using callback 'dummy' (`cb=dummy`).
+
+
+@note The values are <a href="http://en.wikipedia.org/wiki/Percent-encoding">URL encoded</a> so they
+have to be decoded before they can be used.
+
+\subsection extsearch_api_search_out Search results format
+
+When invoking the search engine as shown in the previous subsection, it should reply with
+the results. The format of the reply is
+<a href="http://en.wikipedia.org/wiki/JSONP">JSON with padding</a>, which is basically
+a JavaScript struct wrapped in a function call. The name of the function should be the name of
+the callback (as passed with the *cb* field in the query).
+
+With the example query as shown in the previous subsection, the main structure of the reply should
+look as follows:
+
+    dummy({
+      "hits":179,
+      "first":20,
+      "count":20,
+      "page":1,
+      "pages":9,
+      "query": "list",
+      "items":[
+      ...
+     ]})
+
+The fields have the following meaning:
+- *hits*:  the total number of search results (could be more than was requested).
+- *first*: the index of first result returned: \f$\min(n*p,\mbox{\em hits})\f$.
+- *count*: the actual number of results returned: \f$\min(n,\mbox{\em hits}-\mbox{\em first})\f$
+- *page*:  the page number of the result: \f$p\f$
+- *pages*: the total number of pages: \f$\lceil\frac{\mbox{\em hits}}{n}\rceil\f$.
+- *items*: an array containing the search data per result.
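+
+As a worked example of these formulas, take the reply shown above, where
+\f$\mbox{\em hits}=179\f$, \f$n=20\f$ and \f$p=1\f$:
+\f[
+\mbox{\em first} = \min(20\cdot 1,179) = 20,\quad
+\mbox{\em count} = \min(20,179-20) = 20,\quad
+\mbox{\em pages} = \left\lceil\frac{179}{20}\right\rceil = 9
+\f]
+Assuming pages are counted from 0, the last page is \f$p=8\f$, for which the
+same formulas give \f$\mbox{\em first}=160\f$ and 
+\f$\mbox{\em count}=\min(20,179-160)=19\f$.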
+
+Here is an example of what an element of the *items* array should look like:
+
+    {"type": "function",
+     "name": "QDir::entryInfoList(const QString &nameFilter, int filterSpec=DefaultFilter, int sortSpec=DefaultSort) const",
+     "tag": "qtools.tag",
+     "url": "d5/d8d/class_q_dir.html#a9439ea6b331957f38dbad981c4d050ef",
+     "fragments":[
+       "Returns a <span class=\"hl\">list</span> of QFileInfo objects for all files and directories...",
+       "... pointer to a QFileInfoList The <span class=\"hl\">list</span> is owned by the QDir object...",
+       "... to keep the entries of the <span class=\"hl\">list</span> after a subsequent call to this..."
+     ]
+    },
+
+The fields for such an item have the following meaning:
+- *type*: the type of the item, as found in the field with name "type" in the raw search data.
+- *name*: the name of the item, including the parameter list, as found in the fields with
+          name "name" and "args" in the raw search data.
+- *tag*:  the name of the tag file, as found in the field with name "tag" in the raw search data.
+- *url*:  the (relative) URL to the documentation, as found in the field with name "url"
+          in the raw search data.
+- *fragments*: an array with 0 or more fragments of text containing words that have been searched for.
+          These words should be wrapped in `<span class="hl">` and `</span>` tags to highlight them
+          in the output.
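+
+Combining both levels, a complete reply could look as follows. This is a
+made-up illustration, not actual output of `doxysearch.cgi`: it assumes the
+query `q=list&n=20&p=0&cb=dummy` matched exactly one entry, and it reuses the 
+item shown above with only its first fragment.
+
+    dummy({
+      "hits":1,
+      "first":0,
+      "count":1,
+      "page":0,
+      "pages":1,
+      "query": "list",
+      "items":[
+        {"type": "function",
+         "name": "QDir::entryInfoList(const QString &nameFilter, int filterSpec=DefaultFilter, int sortSpec=DefaultSort) const",
+         "tag": "qtools.tag",
+         "url": "d5/d8d/class_q_dir.html#a9439ea6b331957f38dbad981c4d050ef",
+         "fragments":[
+           "Returns a <span class=\"hl\">list</span> of QFileInfo objects for all files and directories..."
+         ]
+        }
+      ]})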
+
+*/