Documentation/Example of Input Pipeline #511

@tomblaze

Hello, and thanks for this project. I've been trying to figure out how to feed in custom data without using a feed_dict-like object, which is generally considered inefficient.

In issue #93 there seems to have been a decision not to document QueueRunner. However, it is unclear to me whether the DataLoader() object in the MNIST example uses a tf.data interface, and the only documentation on I/O still seems to point to the QueueRunner functions.

I just want to accomplish the simple task of queuing data that will arrive as a series of arrays of integers. I tried to implement my own queue but have not been able to get it to run. I would very much appreciate some help getting this to work. Thanks!

Current attempt:

using TensorFlow
using ResumableFunctions  # possible generator

@resumable function array_iterator()
    while true  # imagine looping over a file until the end of training
        # just generate random data for now
        cur_ar = [convert(Int32, rand(1:10)) for x in 1:50]
        @yield cur_ar  # convert(Tensor, cur_ar) also doesn't work
    end
end

function test()
    batch_size = 4

    # not sure how to pass in the shape; have tried several variants
    queue = TensorFlow.FIFOQueue(60, [Int32], shapes=[[50, 1]])
    enqueue_op = TensorFlow.enqueue(queue, [collect(x) for x in array_iterator()])
    batch = TensorFlow.dequeue_many(queue, batch_size, name = "dequeue")
    runner = QueueRunner(queue, [enqueue_op])
    add_queue_runner(runner)

    sess = TensorFlow.Session()

    start_queue_runners(sess)

    for i in range(5)  # just print out a few examples
        println(run(sess, batch))
    end

    clear_queue_runners(sess)
end

test()
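
For comparison, the producer/consumer shape I am after can be sketched in plain Julia with a `Base.Channel`, without any TensorFlow queue ops. This is only an illustration of the intended behaviour, not a TensorFlow.jl solution: `array_channel` and `next_batch` are hypothetical helper names, and the finite `n_items` stands in for reading a file until training ends.

```julia
# Producer: a background task pushes vectors of `len` random Int32 values
# (drawn from 1:10) into a bounded channel, then the channel is closed.
function array_channel(; n_items::Int = 20, len::Int = 50, capacity::Int = 60)
    ch = Channel{Vector{Int32}}(capacity)
    task = @async begin
        for _ in 1:n_items
            put!(ch, rand(Int32(1):Int32(10), len))
        end
    end
    bind(ch, task)  # close the channel when the producer task finishes
    return ch
end

# Consumer: take `batch_size` vectors and stack them into a
# batch_size × len matrix. Throws if the channel runs dry mid-batch.
function next_batch(ch::Channel{Vector{Int32}}, batch_size::Int)
    rows = Vector{Vector{Int32}}()
    while length(rows) < batch_size
        push!(rows, take!(ch))
    end
    return permutedims(hcat(rows...))
end

ch = array_channel(n_items = 20)
for i in 1:5  # 20 items / batch size 4 = 5 full batches
    batch = next_batch(ch, 4)
    println(size(batch))
end
```

Each batch here is a 4 × 50 `Matrix{Int32}`. What I cannot work out is how to get the equivalent behaviour from the queue ops so that the batches never pass through a feed_dict.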

EDIT:

julia> tf_versioninfo()
Wording: Please copy-paste the entirely of the below output into any bug reports.
Note that this may display some errors, depending upon on your configuration. This is fine.

----------------
Library Versions
----------------
Trying to evaluate ENV["TF_USE_GPU"] but got error: KeyError("TF_USE_GPU")
ENV["LIBTENSORFLOW"] = /home/tb/tf_lib/tensorflow/bazel-bin/tensorflow/libtensorflow.so

tf_version(kind=:backend) = 1.14.1
Trying to evaluate tf_version(kind=:python) but got error: RemoteException(2, CapturedException(ErrorException("The Python TensorFlow package could not be imported. You must install Python TensorFlow before using this package."), Any[(error at error.jl:33, 1), (init at py.jl:14, 1), (#4 at TensorFlow.jl:163, 1), (#116 at process_messages.jl:276, 1), (run_work_thunk at process_messages.jl:56, 1), (run_work_thunk at process_messages.jl:65, 1), (#102 at task.jl:259, 1)]))
tf_version(kind=:julia) = 0.11.0

-------------
Python Status
-------------
PyCall.conda = true
Trying to evaluate ENV["PYTHON"] but got error: KeyError("PYTHON")
PyCall.PYTHONHOME = /home/tb/.julia/conda/3:/home/tb/.julia/conda/3
String(read(#= /home/tb/.julia/packages/TensorFlow/q9pY2/src/version.jl:104 =# (Core.:(@cmd))("pip --version"))) = pip 18.0 from /usr/local/lib/python2.7/dist-packages/pip (python 2.7)

pyenv: pip3: command not found

The `pip3' command exists in these Python versions:
  3.7.2

Trying to evaluate String(read(#= /home/tb/.julia/packages/TensorFlow/q9pY2/src/version.jl:105 =# (Core.:(@cmd))("pip3 --version"))) but got error: ErrorException("failed process: Process(`pip3 --version`, ProcessExited(127)) [127]")

------------
Julia Status
------------
Julia Version 1.1.0
Commit 80516ca202 (2019-01-21 21:24 UTC)
Platform Info:
  OS: Linux (x86_64-pc-linux-gnu)
  CPU: Intel(R) Core(TM) i7-5820K CPU @ 3.30GHz
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-6.0.1 (ORCJIT, haswell)
