An interface to functions to manipulate databases.
- data Database
- data EstError
- data AttrIndexType
- data OptimizeOption
- = NoPurge
- | NoDBOptimize
- data RemoveOption = CleaningRemove
- data PutOption
- data GetOption
- = NoAttributes
- | NoText
- | NoKeywords
- data OpenMode
- = Reader [ReaderOption]
- | Writer [WriterOption]
- data ReaderOption = ReadLock LockingMode
- data WriterOption
- = Create [CreateOption]
- | Truncate [CreateOption]
- | WriteLock LockingMode
- data LockingMode
- data CreateOption
- data AnalysisOption
- data IndexTuning
- data ScoreOption
- withDatabase :: FilePath -> OpenMode -> (Database -> IO a) -> IO a
- openDatabase :: FilePath -> OpenMode -> IO (Either EstError Database)
- closeDatabase :: Database -> IO ()
- addAttrIndex :: Database -> Text -> AttrIndexType -> IO ()
- flushDatabase :: Database -> Int -> IO ()
- syncDatabase :: Database -> IO ()
- optimizeDatabase :: Database -> [OptimizeOption] -> IO ()
- mergeDatabase :: Database -> FilePath -> [RemoveOption] -> IO ()
- setCacheSize :: Database -> Int -> Int -> Int -> Int -> IO ()
- putDocument :: Database -> Document -> [PutOption] -> IO ()
- removeDocument :: Database -> DocumentID -> [RemoveOption] -> IO ()
- updateDocAttrs :: Database -> Document -> IO ()
- getDocument :: Database -> DocumentID -> [GetOption] -> IO Document
- getDocAttr :: Database -> DocumentID -> Text -> IO (Maybe Text)
- getDocURI :: Database -> DocumentID -> IO URI
- getDocIdByURI :: Database -> URI -> IO (Maybe DocumentID)
- getDatabaseName :: Database -> IO Text
- getNumOfDocs :: Database -> IO Int
- getNumOfWords :: Database -> IO Int
- getDatabaseSize :: Database -> IO Integer
- hasFatalError :: Database -> IO Bool
- searchDatabase :: Database -> Condition -> IO [DocumentID]
- searchDatabase' :: Database -> Condition -> IO ([DocumentID], [(Text, Int)])
- metaSearch :: [Database] -> Condition -> IO [(Database, DocumentID)]
- metaSearch' :: [Database] -> Condition -> IO ([(Database, DocumentID)], [(Text, Int)])
- scanDocument :: Database -> Document -> Condition -> IO Bool
Types
data EstError
EstError represents an error that occurred during various operations.
InvalidArgument | An argument passed to the function was invalid. |
AccessForbidden | The operation is forbidden. |
LockFailure | Failed to lock the database. |
DatabaseProblem | The database has a problem. |
IOProblem | An I/O operation failed. |
NoSuchItem | An object you specified does not exist. |
MiscError | Errors for other reasons. |
data AttrIndexType
AttrIndexType represents an index type for an attribute.
SeqIndex | Map from a document ID to an attribute value. This type of index increases the efficiency of, say, |
StrIndex | Map from an attribute value to a document ID. This increases the search speed when you search for documents by an attribute value. |
NumIndex | This is similar to |
data OptimizeOption
OptimizeOption is an option for the optimizeDatabase action.
NoPurge | Omit the process which purges the garbage left by removed documents. |
NoDBOptimize | Omit the process which optimizes the database file. |
data RemoveOption
RemoveOption is an option for the mergeDatabase action and the removeDocument action.
CleaningRemove | Clean up the region in the database where the removed documents were placed. |
data PutOption
PutOption is an option for the putDocument action.
CleaningPut | If the new document overwrites an old one, clean up the region in the database where the old document was placed. |
WeightStatically | Statically apply the "@weight" attribute of the document. |
data GetOption
GetOption is an option for the getDocument action.
NoAttributes | Don't retrieve the attributes of the document. |
NoText | Don't retrieve the body of the document. |
NoKeywords | Don't retrieve the keywords of the document. |
data OpenMode
OpenMode represents how to open a database.
Reader [ReaderOption] | Open the database in read-only mode. You can specify |
Writer [WriterOption] | Open the database in writable mode. You can specify |
data ReaderOption
ReaderOption is an option for the Reader constructor.
ReadLock LockingMode | Specify how to lock the database. |
data WriterOption
WriterOption is an option for the Writer constructor.
Create [CreateOption] | Create a database if an old one doesn't exist. You can specify |
Truncate [CreateOption] | Always create a new database even if an old one already exists. You can specify |
WriteLock LockingMode | Specify how to lock the database. |
data LockingMode
LockingMode represents how to lock the database.
NoLock | Perform no exclusive access control at all. This option is very unsafe. |
NonblockingLock | Use a non-blocking lock. (The author of this module doesn't know what happens if this option is in effect. See the manual and the source code of HyperEstraier and QDBM.) |
data CreateOption
CreateOption is an option for the Create constructor.
Analysis AnalysisOption | Specify the word analysis method. |
Index IndexTuning | Specify the prospective size of the database. |
Score [ScoreOption] | Specify how to handle scores of the documents. |
data AnalysisOption
AnalysisOption is an option for the Analysis constructor.
PerfectNGram | Use the perfect N-gram analyzer. |
CharCategory | Use the character category analyzer. |
data IndexTuning
IndexTuning is an option for the Index constructor.
Small | Predict the database will have less than 50,000 documents. |
Large | Predict the database will have less than 300,000 documents. |
Huge | Predict the database will have less than 1,000,000 documents. |
Huge2 | Predict the database will have less than 5,000,000 documents. |
Huge3 | Predict the database will have more than 10,000,000 documents. |
data ScoreOption
ScoreOption is an option for the Score constructor.
Nullified | Nullify anything about the score of documents. |
StoredAsInt | Store the scores for documents into the database as 32-bit integers. |
OnlyToBeStored | Store the scores for documents into the database but don't use them during the search operation. |
Opening and closing databases
withDatabase :: FilePath -> OpenMode -> (Database -> IO a) -> IO a
withDatabase fpath mode f opens a database at fpath and computes f. When the action f finishes or throws an exception, the database will be closed automatically. If withDatabase fails to open the database, it throws an EstError. See openDatabase.
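As a minimal sketch of the bracket-style usage described above: the module name Text.HyperEstraier.Database and the path "casket" are assumptions made for illustration, while the constructors and functions are the ones listed on this page.

```haskell
import Text.HyperEstraier.Database  -- assumed module name for this page

main :: IO ()
main = do
  -- Open a writable database, creating it if it does not exist yet.
  -- "casket" is a hypothetical path.
  n <- withDatabase "casket" (Writer [Create [Index Small]]) $ \db ->
         getNumOfDocs db
  -- The database has already been closed by the time we get here.
  putStrLn ("documents: " ++ show n)
```

Because withDatabase throws an EstError when opening fails, a caller that prefers to inspect the error as a value may find openDatabase a better fit.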
openDatabase :: FilePath -> OpenMode -> IO (Either EstError Database)
openDatabase fpath mode opens a database at fpath. If it succeeds it returns Right Database, otherwise it returns Left EstError.
The Database can be shared by multiple threads, but there is one important limitation in the current implementation of HyperEstraier itself. A single process can NOT open the same database twice simultaneously. Such an attempt results in AccessForbidden.
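The Either result can be handled by matching on the EstError constructors listed under Types. The following is only a sketch: the module name and the path "casket" are assumptions, and the messages are illustrative.

```haskell
import Control.Exception (finally)
import Text.HyperEstraier.Database  -- assumed module name for this page

main :: IO ()
main = do
  -- Open read-only with no reader options; "casket" is a hypothetical path.
  result <- openDatabase "casket" (Reader [])
  case result of
    Right db ->
      -- Close the database even if the body throws.
      (do n <- getNumOfDocs db
          putStrLn ("documents: " ++ show n))
        `finally` closeDatabase db
    Left AccessForbidden ->
      putStrLn "access forbidden (already open in this process?)"
    Left LockFailure ->
      putStrLn "could not lock the database"
    Left _ ->
      putStrLn "failed to open the database"
```

The explicit `finally` does by hand what withDatabase is documented to do automatically.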
closeDatabase :: Database -> IO ()
closeDatabase db closes the database db. If the db has already been closed, this operation does nothing.
Manipulating database
addAttrIndex :: Database -> Text -> AttrIndexType -> IO ()
addAttrIndex db attr idxType creates an index of type idxType for attribute attr in the database db.
flushDatabase :: Database -> Int -> IO ()
flushDatabase db numWords flushes at most numWords index words in the cache of the database db. If numWords <= 0, all the index words will be flushed.
syncDatabase :: Database -> IO ()
Synchronize a database to the disk.
optimizeDatabase :: Database -> [OptimizeOption] -> IO ()
Optimize a database.
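A hedged sketch of a maintenance routine built from the three actions above. The helper name `compact` and the module name are assumptions, and this page does not say whether syncing already implies a flush, so the explicit flush is a precaution rather than a documented requirement.

```haskell
import Text.HyperEstraier.Database  -- assumed module name for this page

-- | Hypothetical helper: flush the index cache, sync to disk, then optimize.
compact :: Database -> IO ()
compact db = do
  flushDatabase db 0      -- numWords <= 0 flushes all index words
  syncDatabase db
  optimizeDatabase db []  -- empty list: neither NoPurge nor NoDBOptimize is set
```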
mergeDatabase :: Database -> FilePath -> [RemoveOption] -> IO ()
mergeDatabase db fpath opts merges another database at fpath (source) into db (destination). The flags of the two databases must be the same. If any documents in the source database have the same URI as documents in the destination, those documents in the destination will be overwritten.
setCacheSize
:: Database | The database. |
-> Int | Maximum size of the index cache. (default: 64 MiB) |
-> Int | Maximum records of cached attributes. (default: 8192 records) |
-> Int | Maximum number of cached document text. (default: 1024 documents) |
-> Int | Maximum number of the cached search results. (default: 256 records) |
-> IO () |
Change the size of various caches of a database. Passing negative values leaves the old values unchanged.
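For example, to raise only the index cache limit while leaving the other three limits alone, negative values can be passed for the remaining arguments. This is a sketch: the helper name and module name are assumptions, and it assumes the index cache size is given in bytes, which this page does not state explicitly.

```haskell
import Text.HyperEstraier.Database  -- assumed module name for this page

-- | Hypothetical helper: set the index cache to 128 MiB (assuming bytes)
-- and keep the attribute, text, and result cache limits unchanged.
enlargeIndexCache :: Database -> IO ()
enlargeIndexCache db =
  setCacheSize db (128 * 1024 * 1024) (-1) (-1) (-1)
```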
Getting documents in and out
putDocument :: Database -> Document -> [PutOption] -> IO ()
Put a document into a database. The document must have an "@uri" attribute. If the database already has a document whose URI is the same as that of the new document, the old one will be overwritten. See setURI and updateDocAttrs.
removeDocument :: Database -> DocumentID -> [RemoveOption] -> IO ()
Remove a document from a database.
updateDocAttrs :: Database -> Document -> IO ()
Update attributes of a document in a database. The document to be updated is determined by the document ID. It is an error to change the URI of the document to be the same as that of an existing document. Note that the document body will not be updated. See putDocument.
getDocument :: Database -> DocumentID -> [GetOption] -> IO Document
Find a document in a database by an ID.
getDocAttr :: Database -> DocumentID -> Text -> IO (Maybe Text)
Get an attribute of a document in a database.
getDocURI :: Database -> DocumentID -> IO URI
Get the URI of a document in a database.
getDocIdByURI :: Database -> URI -> IO (Maybe DocumentID)
Find a document in a database by a URI and return its ID.
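The lookup functions compose naturally: resolve a URI to a document ID, then read an attribute by that ID. A minimal sketch, in which the helper name `lookupAttr`, the module name, and the assumption that URI comes from Network.URI are not taken from this page:

```haskell
import Data.Text (Text)
import Network.URI (URI)            -- assumed source of the URI type
import Text.HyperEstraier.Database  -- assumed module name for this page

-- | Hypothetical helper: fetch one attribute of the document with the given URI.
-- Returns Nothing if no such document exists or the attribute is absent.
lookupAttr :: Database -> URI -> Text -> IO (Maybe Text)
lookupAttr db uri name = do
  mDocId <- getDocIdByURI db uri
  case mDocId of
    Nothing    -> return Nothing
    Just docId -> getDocAttr db docId name
```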
Statistics of databases
getDatabaseName :: Database -> IO Text
Get the name of a database.
getNumOfDocs :: Database -> IO Int
Get the number of documents in a database.
getNumOfWords :: Database -> IO Int
Get the number of words in a database.
getDatabaseSize :: Database -> IO Integer
Get the size of a database.
hasFatalError :: Database -> IO Bool
Return True iff the database has a fatal error.
Searching for documents
searchDatabase :: Database -> Condition -> IO [DocumentID]
Search for documents in a database by a condition.
searchDatabase' :: Database -> Condition -> IO ([DocumentID], [(Text, Int)])
Search for documents in a database by a condition. The second item of the resulting tuple maps each search word to the number of documents that match it.
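A sketch of consuming both parts of the result. The helper name `reportSearch` and both module names are assumptions, and the Condition is taken as an argument because constructing one is outside the scope of this page.

```haskell
import qualified Data.Text as T
import Text.HyperEstraier.Condition (Condition)  -- assumed module for Condition
import Text.HyperEstraier.Database               -- assumed module name for this page

-- | Hypothetical helper: print the hit count, then the per-word counts.
reportSearch :: Database -> Condition -> IO ()
reportSearch db cond = do
  (ids, hints) <- searchDatabase' db cond
  putStrLn (show (length ids) ++ " matching documents")
  -- Each hint pairs a search word with the number of documents matching it.
  mapM_ (\(word, n) -> putStrLn (T.unpack word ++ ": " ++ show n)) hints
```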
metaSearch :: [Database] -> Condition -> IO [(Database, DocumentID)]
Search for documents in many databases at once.
metaSearch' :: [Database] -> Condition -> IO ([(Database, DocumentID)], [(Text, Int)])
Search for documents in many databases at once. The second item of the resulting tuple maps each search word to the number of documents that match it.
scanDocument :: Database -> Document -> Condition -> IO Bool
Check whether a document matches every phrase in a condition.
To be honest with you, the author of this binding doesn't really know what est_db_scan_doc() does. Its documentation is way too ambiguous across the board. Moreover, the symbols of HyperEstraier are very badly named. Can you imagine what, say, est_db_out_doc() does? How about the constant named ESTCONDSURE? The author got tired of examining the commentless source code over and over again to write this binding. Its functionality is awesome though...