The semantics of TessBaseAPI::Clear()


Suppose I've created two objects of TessBaseAPI, xapi and yapi, initialized by calling the following overload of the Init() function:

int Init(const char * datapath,
         const char * language,
         OcrEngineMode  oem,
         char **    configs,
         int    configs_size,
         const GenericVector< STRING > *    vars_vec,
         const GenericVector< STRING > *    vars_values,
         bool   set_only_non_debug_params 
);

passing exactly identical arguments.

Since the objects are initialized with identical arguments, at this point xapi and yapi are assumed to be identical from a behavioral[1] perspective. Is my assumption correct? I hope so, as I can't find any reason for the objects to be non-identical.


Next I use xapi to extract information from an image, but first I call SetVariable() a number of times to set a few more configuration values:

bool SetVariable(const char * name, const char * value);

and then I use xapi to extract some text from an image. Once the extraction is done, I do this:

 xapi.Clear(); //what exactly happens here?

After the call to Clear(), can I use xapi and yapi interchangeably? In other words, can I assume that xapi and yapi are identical at this point from a behavioral[1] perspective? Can I say Clear() is actually a reset function?

1. By "behavioral", I meant performance in terms of accuracy, not speed/latency.


Solution

  • Since the objects are initialized with identical arguments, at this point xapi and yapi are assumed to be identical from behavioral perspective. Is my assumption correct?

    On the face of it, I can find nothing to dispute this assumption.

    Looking at the source code, this is what runs when Clear() is called:

    void TessBaseAPI::Clear() {
      if (thresholder_ != NULL)
        thresholder_->Clear();
      ClearResults();
    }
    

    Calling thresholder_->Clear() destroys the Pix, if there is one:

    // Destroy the Pix if there is one, freeing memory.
    void ImageThresholder::Clear() {
      if (pix_ != NULL) {
        pixDestroy(&pix_);
        pix_ = NULL;
      }
      image_data_ = NULL;
    }
    

    ClearResults() is shown below:

    void TessBaseAPI::ClearResults() {
      if (tesseract_ != NULL) {
        tesseract_->Clear();
      }
      if (page_res_ != NULL) {
        delete page_res_;
        page_res_ = NULL;
      }
      recognition_done_ = false;
      if (block_list_ == NULL)
        block_list_ = new BLOCK_LIST;
      else
        block_list_->clear();
      if (paragraph_models_ != NULL) {
        paragraph_models_->delete_data_pointers();
        delete paragraph_models_;
        paragraph_models_ = NULL;
      }
    }
    

    The page results and the paragraph models are deleted and set to NULL, the block list is emptied (or created if it did not exist yet), and the recognition_done_ flag is reset.

    tesseract_->Clear() releases the following:

    void Tesseract::Clear() {
      pixDestroy(&pix_binary_);
      pixDestroy(&cube_binary_);
      pixDestroy(&pix_grey_);
      pixDestroy(&scaled_color_);
      deskew_ = FCOORD(1.0f, 0.0f);
      reskew_ = FCOORD(1.0f, 0.0f);
      splitter_.Clear();
      scaled_factor_ = -1;
      ResetFeaturesHaveBeenExtracted();
      for (int i = 0; i < sub_langs_.size(); ++i)
        sub_langs_[i]->Clear();
    }
    

    Note that SetVariable() cannot change init variables. Its documentation says:

    Only works for non-init variables (init variables should be passed to Init()).

    bool TessBaseAPI::SetVariable(const char* name, const char* value) {
      if (tesseract_ == NULL) tesseract_ = new Tesseract;
      return ParamUtils::SetParam(name, value, SET_PARAM_CONSTRAINT_NON_INIT_ONLY,
                                  tesseract_->params());
    }
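
The effect of a "non-init only" constraint can be illustrated with a small stand-alone model. This is only a sketch: ToyParams, Constraint and the parameter names are hypothetical and are not Tesseract's classes; the real logic lives in ParamUtils::SetParam.

```cpp
#include <cassert>
#include <map>
#include <string>

// Toy model only: these names are hypothetical, not Tesseract's own types.
enum class Constraint { None, NonInitOnly };

struct ToyParam {
    std::string value;
    bool is_init;  // true if the parameter may only be set while initializing
};

class ToyParams {
public:
    void Register(const std::string& name, const std::string& def, bool is_init) {
        table_[name] = ToyParam{def, is_init};
    }

    // Returns false for unknown names or when the constraint forbids the write.
    bool Set(const std::string& name, const std::string& value, Constraint c) {
        auto it = table_.find(name);
        if (it == table_.end()) return false;
        if (c == Constraint::NonInitOnly && it->second.is_init) return false;
        it->second.value = value;
        return true;
    }

    // Returns the current value, or an empty string for unknown names.
    std::string Get(const std::string& name) const {
        auto it = table_.find(name);
        return it == table_.end() ? std::string() : it->second.value;
    }

private:
    std::map<std::string, ToyParam> table_;
};
```

In this model a write to an init parameter under Constraint::NonInitOnly is rejected and the stored value is left untouched, which is the behaviour the documentation sentence above describes.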
    

    After the call to Clear(), can I use xapi and yapi interchangeably?

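    No, certainly not if you used a thresholder: its image is destroyed by Clear(), while the values you set with SetVariable() stay in place on xapi.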

    Can I say Clear() is actually a reset functionality?

    Not in the sense of restoring it to its initialised state. It sets some members of the object to NULL, but it keeps the expensive work done for parameters such as const char * datapath, const char * language and OcrEngineMode oem. It appears to be a way to free memory without destroying the object, in line with "without actually freeing any recognition data that would be time-consuming to reload.".

    After calling Clear(), call either SetImage or TesseractRect before using Recognize or the Get* functions.
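
The call-order rule can be sketched with a toy class. ToyOcr and its members are hypothetical stand-ins, not the real TessBaseAPI; the sketch only assumes that recognition needs an image and that Clear() drops it.

```cpp
#include <cassert>
#include <string>

// Toy stand-in for the call-order rule; this is NOT the real TessBaseAPI.
class ToyOcr {
public:
    // Attaches a new image and discards any earlier result.
    void SetImage(const std::string& pixels) {
        image_ = pixels;
        has_image_ = true;
        text_.clear();
    }

    // Fails (returns false) when no image is attached.
    bool Recognize() {
        if (!has_image_) return false;
        text_ = "text from " + image_;
        return true;
    }

    // Drops the per-image state: the image and the result.
    void Clear() {
        image_.clear();
        has_image_ = false;
        text_.clear();
    }

    const std::string& Text() const { return text_; }

private:
    std::string image_;
    bool has_image_ = false;
    std::string text_;
};
```

In this model, calling Recognize() straight after Clear() fails until SetImage() has been called again, which mirrors the ordering requirement stated above.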

    Clear() does not discard the values set with SetVariable(); they are reset to their defaults only when the object is torn down by calling End().
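
Why xapi and yapi stop being interchangeable can be shown with a small model. ToyEngine, its single variable "toy_mode" and SameSettingsAs() are all hypothetical; the sketch assumes, as described above, that Clear() touches only per-image state and that End() followed by a fresh Init() restores the defaults.

```cpp
#include <cassert>
#include <map>
#include <string>

// Toy model of the lifecycle discussed above; not Tesseract code.
class ToyEngine {
public:
    // Loads default settings, like a fresh initialization.
    void Init() {
        vars_.clear();
        vars_["toy_mode"] = "default";  // hypothetical variable name
        image_loaded_ = false;
    }

    // Changes a known setting; returns false for unknown names.
    bool SetVariable(const std::string& name, const std::string& value) {
        auto it = vars_.find(name);
        if (it == vars_.end()) return false;
        it->second = value;
        return true;
    }

    void LoadImage() { image_loaded_ = true; }

    // Releases per-image state only; settings are left alone.
    void Clear() { image_loaded_ = false; }

    // Tears everything down, settings included.
    void End() {
        vars_.clear();
        image_loaded_ = false;
    }

    bool SameSettingsAs(const ToyEngine& other) const {
        return vars_ == other.vars_;
    }

private:
    std::map<std::string, std::string> vars_;
    bool image_loaded_ = false;
};
```

Under these assumptions, two engines initialized identically compare equal, diverge once one of them receives a SetVariable() call, stay different after Clear(), and match again only after End() and a new Init().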

    Looking at the TessBaseAPI() constructor, you can see which members are initialised and which of them Clear() resets.

    TessBaseAPI::TessBaseAPI()
      : tesseract_(NULL),
        osd_tesseract_(NULL),
        equ_detect_(NULL),
        // Thresholder is initialized to NULL here, but will be set before use by:
        // A constructor of a derived API,  SetThresholder(), or
        // created implicitly when used in InternalSetImage.
        thresholder_(NULL),
        paragraph_models_(NULL),
        block_list_(NULL),
        page_res_(NULL),
        input_file_(NULL),
        output_file_(NULL),
        datapath_(NULL),
        language_(NULL),
        last_oem_requested_(OEM_DEFAULT),
        recognition_done_(false),
        truth_cb_(NULL),
        rect_left_(0), rect_top_(0), rect_width_(0), rect_height_(0),
        image_width_(0), image_height_(0) {
    }
    

    Given that the simplest Init() overload forwards to the full one with these arguments:

    (datapath, language, OEM_DEFAULT, NULL, 0, NULL, NULL, false);
    

    The first three parameters (datapath, language and OcrEngineMode) are always needed, which makes sense.

    If the datapath, OcrEngineMode or the language have changed - start again.
    Note that the language_ field stores the last requested language that was initialized successfully, while tesseract_->lang stores the language actually used. They differ only if the requested language was NULL, in which case tesseract_->lang is set to the Tesseract default ("eng").
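
The distinction between the requested and the used language can be sketched as follows. ToyLanguages and ResolveLanguage() are hypothetical helpers written for illustration; only the "eng" fallback for a NULL request comes from the note above.

```cpp
#include <cassert>
#include <string>

// Toy illustration of "requested" versus "actually used" language.
struct ToyLanguages {
    bool requested_is_null;  // true when the caller passed NULL
    std::string requested;   // what the caller asked for (empty if NULL)
    std::string used;        // what the engine would actually load
};

// Hypothetical helper: falls back to "eng" only when the request is NULL.
ToyLanguages ResolveLanguage(const char* requested) {
    ToyLanguages out;
    out.requested_is_null = (requested == nullptr);
    out.requested = out.requested_is_null ? std::string() : std::string(requested);
    out.used = out.requested_is_null ? std::string("eng") : out.requested;
    return out;
}
```

In this sketch the two values differ only for a NULL request, which is the single case the note singles out.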