training-set options don't work #5

@KGOH

In synaptic/datasets.clj there is a mistake in how options are passed:

(defn training-set
  "Create a training set from samples and associated labels.
  The training set consists of one or more batches and optionally a validation set.
  It also has a map that will allow converting y's back to the original labels.
  
  Options:
    :name        - a name for the training set
    :type        - the type of training data (e.g. :binary-image, :grayscale-image ...)
    :fieldsize   - [width height] of each sample data (for images)
    :nvalid      - size of the validation set (default is 0, i.e. no validation set)
    :batch       - size of a mini-batch (default is the number of samples, after
                   having set apart the validation set)
    :online true - set this flag for online training (same as batch size = 1)
    :rand false  - unset this flag to keep original ordering (by default, samples
                   will be shuffled before partitioning)."
  [samples labels & [options]]
  {:pre [(= (count samples) (count labels))]}
  (let [batchsize  (if (:online options) 1 (:batch options))
        trainsize  (if (:nvalid options) (- (count samples) (:nvalid options)))
        randomize  (if (nil? (:rand options)) true (:rand options))
        [binlb uniquelb]    (u/tobinary labels)
        [smp lb]   (if randomize (shuffle-vecs samples binlb) [samples binlb])
        [trainsmp validsmp] (if trainsize (split-at trainsize smp) [smp nil])
        [trainlb  validlb]  (if trainsize (split-at trainsize lb) [lb nil])
        [batchsmp batchlb]  (partition-vecs batchsize trainsmp trainlb)
        trainsets  (mapv dataset batchsmp batchlb)
        validset   (if trainsize (dataset validsmp validlb))
        timestamp  (System/currentTimeMillis)
        header     {:name (or (:name options) timestamp)
                    :timestamp timestamp
                    :type (:type options)
                    :fieldsize (or (:fieldsize options)
                                   (u/divisors (count (first samples))))
                    :batches (mapv (partial count-labels uniquelb) batchlb)
                    :valid (count-labels uniquelb validlb)
                    :labels uniquelb}]
    (TrainingSet. header trainsets validset)))

The argument vector for this function is [samples labels & [options]],
but it should be [samples labels & options],
and the first binding in the let should be options (apply hash-map options), so that keyword options passed to the function take effect. As written, options is bound only to the first extra argument (the first keyword) instead of the full set of options.
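
To make the difference concrete, here is a minimal sketch of how the two argument vectors destructure their extra arguments. The functions opts-first, opts-all and opts-kw are hypothetical stand-ins written for this illustration; they are not part of synaptic and only return what would be bound to options.

```clojure
;; Hypothetical stand-ins for training-set's argument vector (not synaptic code).

(defn opts-first
  "Mimics the current signature: `& [options]` binds only the FIRST extra argument."
  [samples labels & [options]]
  options)

(defn opts-all
  "Mimics the proposed signature: all extra arguments are collected into a map."
  [samples labels & options]
  (apply hash-map options))

(defn opts-kw
  "An alternative spelling (assumption, not from the issue): map destructuring
  of the rest arguments, which also yields a map of the keyword options."
  [samples labels & {:as options}]
  options)

;; Called with keyword options, the current signature binds options to :batch,
;; so every lookup such as (:batch options) returns nil and is silently ignored.
(opts-first [1 2] [:a :b] :batch 10 :online true)
(:batch (opts-first [1 2] [:a :b] :batch 10))

;; The current signature does behave as documented when a single map is passed.
(:batch (opts-first [1 2] [:a :b] {:batch 10}))

;; The proposed signature yields a map containing :batch 10 and :online true.
(opts-all [1 2] [:a :b] :batch 10 :online true)

;; With no options at all, (apply hash-map nil) is simply an empty map.
(opts-all [1 2] [:a :b])
```

So with the current code the options are only picked up when they are passed as one map; the change above makes the keyword-argument style work instead. It would also mean that callers passing a single map need to be updated.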

Here is my updated code:

(defn training-set
  "Create a training set from samples and associated labels.
  The training set consists of one or more batches and optionally a validation set.
  It also has a map that will allow converting y's back to the original labels.
  
  Options:
    :name        - a name for the training set
    :type        - the type of training data (e.g. :binary-image, :grayscale-image ...)
    :fieldsize   - [width height] of each sample data (for images)
    :nvalid      - size of the validation set (default is 0, i.e. no validation set)
    :batch       - size of a mini-batch (default is the number of samples, after
                   having set apart the validation set)
    :online true - set this flag for online training (same as batch size = 1)
    :rand false  - unset this flag to keep original ordering (by default, samples
                   will be shuffled before partitioning)."
  [samples labels & options]
  {:pre [(= (count samples) (count labels))]}
  (let [options (apply hash-map options)
        batchsize  (if (:online options) 1 (:batch options))
        trainsize  (if (:nvalid options) (- (count samples) (:nvalid options)))
        randomize  (if (nil? (:rand options)) true (:rand options))
        [binlb uniquelb]    (u/tobinary labels)
        [smp lb]   (if randomize (shuffle-vecs samples binlb) [samples binlb])
        [trainsmp validsmp] (if trainsize (split-at trainsize smp) [smp nil])
        [trainlb  validlb]  (if trainsize (split-at trainsize lb) [lb nil])
        [batchsmp batchlb]  (partition-vecs batchsize trainsmp trainlb)
        trainsets  (mapv dataset batchsmp batchlb)
        validset   (if trainsize (dataset validsmp validlb))
        timestamp  (System/currentTimeMillis)
        header     {:name (or (:name options) timestamp)
                    :timestamp timestamp
                    :type (:type options)
                    :fieldsize (or (:fieldsize options)
                                   (u/divisors (count (first samples))))
                    :batches (mapv (partial count-labels uniquelb) batchlb)
                    :valid (count-labels uniquelb validlb)
                    :labels uniquelb}]
    (TrainingSet. header trainsets validset)))
