Fix some typos (mostly found by codespell) #8
Open
stweil wants to merge 2 commits into Doreenruirui:master from
Conversation
stweil
commented
Sep 19, 2019
- **Okralact** is both a set of specifications and a prototype implementation for harmonizing the input data, parameterization and provenance tracking of training different OCR engines. It is a client/server architecture application. The interactions between the client nodes and the server are implementeqd using **Flask**, a lightweight web application framework for Python. All the training or evaluation jobs submitted to the server are handled in the background by task queues implemented wth **Redis Queue** (**RQ**).
+ **Okralact** is both a set of specifications and a prototype implementation for harmonizing the input data, parameterization and provenance tracking of training different OCR engines. It is a client/server architecture application. The interactions between the client nodes and the server are implemented using **Flask**, a lightweight web application framework for Python. All the training or evaluation jobs submitted to the server are handled in the background by task queues implemented with **Redis Queue** (**RQ**).
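For readers unfamiliar with the Flask/RQ pattern the corrected paragraph describes, here is a minimal, hypothetical sketch of a Flask endpoint handing a training job off to a Redis Queue worker. The `/train` route, the payload fields, and the `run_training` task are illustrative assumptions, not Okralact's actual API.

```python
# Minimal sketch of the Flask + RQ pattern described above; route name,
# payload fields and run_training are illustrative assumptions.
from flask import Flask, jsonify, request
from redis import Redis
from rq import Queue

app = Flask(__name__)
queue = Queue(connection=Redis())  # background task queue backed by Redis


def run_training(engine: str, config: dict) -> str:
    """Hypothetical long-running training task executed by an RQ worker."""
    # ... invoke the selected OCR engine's training here ...
    return f"trained a {engine} model"


@app.route("/train", methods=["POST"])
def submit_training_job():
    payload = request.get_json()
    # Enqueue the job; the HTTP request returns immediately while a separate
    # `rq worker` process picks the task up in the background.
    job = queue.enqueue(run_training, payload["engine"], payload.get("config", {}))
    return jsonify({"job_id": job.get_id()}), 202
```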
stweil
commented
Sep 19, 2019
- Adds either an LSTM or GRU recurrent layer to the network using eiter the x (width) or y (height) dimension as the time axis. Input features are the channel dimension and the non-time-axis dimension (height/width) is treated as another batch dimension. For example, a Lfx25 layer on an 1, 16, 906, 32 input will execute 16 independent forward passes on 906x32 tensors resulting in an output of shape 1, 16, 906, 25. If this isn’t desired either run a summarizing layer in the other direction, e.g. Lfys20 for an input 1, 1, 906, 20, or prepend a reshape layer S1(1x16)1,3 combining the height and channel dimension for an 1, 1, 906, 512 input to the recurrent layer.
+ Adds either an LSTM or GRU recurrent layer to the network using either the x (width) or y (height) dimension as the time axis. Input features are the channel dimension and the non-time-axis dimension (height/width) is treated as another batch dimension. For example, a Lfx25 layer on an 1, 16, 906, 32 input will execute 16 independent forward passes on 906x32 tensors resulting in an output of shape 1, 16, 906, 25. If this isn’t desired either run a summarizing layer in the other direction, e.g. Lfys20 for an input 1, 1, 906, 20, or prepend a reshape layer S1(1x16)1,3 combining the height and channel dimension for an 1, 1, 906, 512 input to the recurrent layer.
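To make the shape bookkeeping in that paragraph concrete, here is a small illustrative sketch that reproduces the 1, 16, 906, 32 → 1, 16, 906, 25 example for an Lfx25 layer. It assumes PyTorch and mimics, rather than reproduces, how the VGSL layer is actually implemented.

```python
# Illustrative sketch (assumes PyTorch) of the Lfx25 shape arithmetic above.
import torch
import torch.nn as nn

x = torch.randn(1, 16, 906, 32)        # batch, height, width (time axis), channels

lstm = nn.LSTM(input_size=32, hidden_size=25, batch_first=True)

# The non-time axis (height) is folded into the batch, so the LSTM runs
# 16 independent forward passes over 906-step sequences of 32 features.
b, h, w, c = x.shape
out, _ = lstm(x.reshape(b * h, w, c))  # (16, 906, 25)
out = out.reshape(b, h, w, -1)         # (1, 16, 906, 25), as stated in the text
print(out.shape)
```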
stweil
commented
Sep 19, 2019
  | Linear | | same with Kraken | output size <br>01c\<s\> | | ❌ | |
  | Modify Top Layers | | —append_index:[int,-1]cut the head off the network at the given index and append —net_spec network in place of cut off part | -a, --append INTEGER: remove layers before argument and then appends spec. Only work when loading an existing model | | ? | |
- | Loading Existing Model | | --continue_from[string, none):path to previous checkpoint from which to continue training or fine tune(training checkpoint or a recognition mdoel)<br>--stop_training[false): convert the training checkpoint in --continue_from to a recognition model<br>—convert_to_int[bool, false]: when using stop_training, convert to 8-bit integer for greater speed, with slightly less accuracy | -i, --load PATH: Load existing file to continue training | | --weights WEIGHTS, string<br>Load network weights from the given file | |
+ | Loading Existing Model | | --continue_from[string, none):path to previous checkpoint from which to continue training or fine tune(training checkpoint or a recognition model)<br>--stop_training[false): convert the training checkpoint in --continue_from to a recognition model<br>—convert_to_int[bool, false]: when using stop_training, convert to 8-bit integer for greater speed, with slightly less accuracy | -i, --load PATH: Load existing file to continue training | | --weights WEIGHTS, string<br>Load network weights from the given file | |
stweil
commented
Sep 19, 2019
  | Preload Data into Memory | | ❌ | --preload / --no-preload: Enables/disables preloading of the training set into memory for accelerated training. The default setting preloads data sets with less than 2500 lines, explicitly adding `--preload` will preload arbitrary sized sets. `--no-preload` disables preloading in all circumstances. | ❌ | —train_data_on_the_fly: Instead of preloading all data during the training, load the data on the fly. This is slower, but might be required for limited RAM or large datasets. —validation_data_on_the_fly:Instead of preloading all data during the training, load the data on the fly. This is slower, but might be required for limited RAM or large datasets | |
  | Number of openMP threads | | ❌ | --threads INTEGER[1]: Number of OpenMP threads and workers when running on CPU. | ❌ | --num_threads NUM_THREADS:The number of threads to use for all operations. —num_inter_threads,int, [0]"Tensorflow's session inter threads param") --num_intra_threads, int, [0], Tensorflow's session intra threads param | |
- | Special | | --max_image_MB, int[6000], maximum amount of memory to use for caching images<br>--perfect_sample_delay, int[0]: When the network gets good, only backprop a perfect sample after this many imperfect samples have been seen since the last perfect sample was allowed through.<br>--sequential_training:[bool, false], true for sequential training. Default to process all training data in round-robin fashion.<br>—traineddata[string,none]path to the starter trained data file that contains the unicharset, recorder and optional language model<br>—debug_interval[int,0]:If non-zero, show visual debugging every this many iterations. | -d, --device TEXT [cpu]:Select device to use (cpu, cuda:0, cuda:1, …)<br> | --start START[-1]:manually set the number of already learned lines, which influences the naming and stoping condition, default: -1 which will then be overriden by the value saved in the network:question: | —no_skip_invalid_gt, Do no skip invalid gt, instead raise an exception.<br>--gradient_clipping_mode GRADIENT_CLIPPING_MODE, Clipping mode of gradients. Defaults to AUTO, possible values are AUTO, NONE, CONSTANT. --gradient_clipping_const GRADIENT_CLIPPING_CONST:Clipping constant of gradients in CONSTANT mode.<br>--gt_extension GT_EXTENSION: Default extension of the gt files (expected to exist in same dir)<br> | |
+ | Special | | --max_image_MB, int[6000], maximum amount of memory to use for caching images<br>--perfect_sample_delay, int[0]: When the network gets good, only backprop a perfect sample after this many imperfect samples have been seen since the last perfect sample was allowed through.<br>--sequential_training:[bool, false], true for sequential training. Default to process all training data in round-robin fashion.<br>—traineddata[string,none]path to the starter trained data file that contains the unicharset, recorder and optional language model<br>—debug_interval[int,0]:If non-zero, show visual debugging every this many iterations. | -d, --device TEXT [cpu]:Select device to use (cpu, cuda:0, cuda:1, …)<br> | --start START[-1]:manually set the number of already learned lines, which influences the naming and stopping condition, default: -1 which will then be overridden by the value saved in the network:question: | —no_skip_invalid_gt, Do no skip invalid gt, instead raise an exception.<br>--gradient_clipping_mode GRADIENT_CLIPPING_MODE, Clipping mode of gradients. Defaults to AUTO, possible values are AUTO, NONE, CONSTANT. --gradient_clipping_const GRADIENT_CLIPPING_CONST:Clipping constant of gradients in CONSTANT mode.<br>--gt_extension GT_EXTENSION: Default extension of the gt files (expected to exist in same dir)<br> | |
stweil (Author)
Ping?
Signed-off-by: Stefan Weil <sw@weilnetz.de>
stweil (Author)
@Doreenruirui, please merge.