Skip to content

bbloom chat

Joseph Parker edited this page Jan 9, 2017 · 2 revisions

selfsame [2:36 PM] is this what you were thinking of? https://gist.github.com/selfsame/e4757c32c08ff5ab3ace3d9a94b9a0e1

bbloom [2:37 PM] yeah, definitely

[2:37]
i was thinking of Graph implementing ILookup etc to return a node too

[2:37]
graph would basically be a vector

[2:38]
hmmmm the assoc is interesting tho

[2:38]
i wasn’t thinking about that necessarily

[2:39]
it’s not clear what to do there

[2:39]
i think you need some sort of graph-specific update-in

[2:39]
like (update-node graph id assoc :foo 1)

[2:40]
i’d expect associng to a node directly to either A) not work, which is what datomic entities do or B) disassociate the node from the graph

selfsame [2:40 PM] ah interesting

bbloom [2:41 PM] if you are interested in this, i would be very interested in providing design feedback, as this is something i could use 🙂

[2:41]
i mean, if you’re going to make a nice useful lib for me and all 😉

selfsame [2:42 PM] yeah I'm into it, you were thinking it will have a lower memory profile?

bbloom [2:43 PM] i think so, especially if you use generic vectors for attributes

[2:43]
there is a separate discussion about specializing vectors

selfsame [2:44 PM] i think 'nth' is faster than map '-lookup' too IIRC

bbloom [2:44 PM] i would certainly expect it to be

[2:45]
one problem is that keyword lookup is fundamentally dynamic - there’s no way to go from keyword to field offset and then reuse that field offset value to do efficient field access across the board

[2:45]
in theory you could have a defrecord like thing that works w/ this soa representation too

[2:46]
so you could do (.foo (graph id))

[2:47]
and i know you love games, so i should add: i expect soa with records support to be exactly the sort of thing you’d want for all your game objects, etc

[2:47]
😉

selfsame [2:47 PM] yeah i've done a few projects with that mutable array pattern

bbloom [2:48 PM] yeah, i think you need a transient version too

[2:48]
so you’d have like update-node and update-node!

selfsame [2:48 PM] but yeah want something immutable that could work with om or whatever

bbloom [2:48 PM] let me tell you about the dynamic-generic vector idea too

[2:48]
so now you can do (vector-of int) or whatever

[2:49]
but it’s not obvious to me why you can’t do that dynamically and incrementally with the existing tree structure of vectors

[2:49]
so like you have a 32-element array in each node of the vector tree

[2:49]
if you add a new node, you allocate an array of the primitive type of the added element

[2:49]
if you add a new element of a different type, you just promote that array to an array of objects

selfsame [2:50 PM] er.. losing me, you mean like typed fields?

bbloom [2:50 PM] yeah, let me try to explain:

selfsame [2:50 PM] hadn't seen vector-of yet 🙂

bbloom [2:51 PM] https://github.com/clojure/clojure/blob/master/src/clj/clojure/gvec.clj GitHub clojure/clojure clojure - The Clojure programming language

[2:51]
see the “array managers”?

[2:52]
let me explain an example

[2:52]
for simplicity, assume a binary version of a persistent vector: ie 2-element arrays for the nodes

[2:52]
you start with some empty node [] and push a number, now you have [1]

[2:52]
push another number and you get [1, 2]

[2:52]
push a 3rd number, now you overflow your node, so you get 1, 2], [3

[2:53]
with me so far?

selfsame [2:53 PM] yeah

bbloom [2:53 PM] ok so let me label these nodes to talk about them

[2:53]
root:[a:[1, 2], b:[3]]

[2:53]
the type of A is “int node"

[2:53]
the type of B is also int node

[2:53]
the type of root is “interior int node"

[2:54]
now if we push a string: root:[a:[1, 2], b:[3,”x”]]

[2:54]
we basically changed b from a “int node” to an “object node"

[2:54]
but there’s no reason to change a!

[2:54]
root becomes a “interior dynamic node"

[2:54]
this way, you don’t actually need to manually say (vector-of int)

[2:54]
it just happens automatically

selfsame [2:55 PM] ok with you

bbloom [2:55 PM] so my theory is that if you do 1) that, which i’m calling “dynamic generic vectors” and 2) the soa thing, you could make tons of useful clojure programs dramatically faster

[2:56]
i started to do #1, it seemed like it would work, but decided to shelve it as i didn’t critically need it right away

[2:56]
#2 is somethign i’ve only been considering more recently, as i have been working with a lot more graphs

selfsame [2:58 PM] i'll have to read the vector implementation

[2:59]
but is the idea that #1 would be a lot more efficient in memory usage?

bbloom [2:59 PM] yeah, it would be like using an Int32Array instead of a javascript array

[3:00]
only it would be an int32-persistent-vector instead of a normal dynamic clojure persistent vector

[3:00]
with the benefit that you don’t have to manually specify the type up front with vector-of

selfsame [3:01 PM] optional user specified types for #2 fields seems desirable

bbloom [3:01 PM] yeah, totally

[3:02]
the #1 idea is about making it automatic

[3:02]
essentially it would optimistically assume primitive types, unless it encounters an object, then it would promote to that

[3:02]
there’s a #3 idea, which is automatic records 😉

[3:03]
in your gist, that would be adding/removing vectors to the graph

selfsame [3:05 PM] haha yeah not exactly sure about records

[3:05]
almost just want to have keys map to an index of vectors

bbloom [3:06 PM] yeah, that’s what i’d do to start before considering records

[3:07]
i think that if you did this soa thing w/ a decent vector & map implementation, transient versions, change-node/change-node!, etc - you’d have an insanely awesome library for game engine entities

[3:08]
one other problem to consider is sparse vectors

[3:08]
consider if every entity has position x, y, z - sure you want 3 vectors of floats - perfect

[3:08]
but what if some subset of entities have a health int?

[3:08]
surely you don’t want a whole vector of mostly nulls or zeros

[3:09]
in theory, the existing vector tree could easily be made sparse by having null branches

[3:09]
if you make that automatic, then boom, you’re in fucking business for fully persistent/transient entity database of awesomeness

[3:10]
aaaanyway, thanks for letting me pontificate 🙂 i’ve got some code to write totally unrelated to this, lol

selfsame [3:10 PM] cool i'll read through the cljs vector/map impls and get back to you after i think about it a bit

bbloom [3:10 PM] cool, good luck

selfsame [4:37 PM] uploaded this image: graph-ram.png Add Comment

bbloom [4:38 PM] haha am i reading this right?

[4:38]
40mb, 10mb, 3mb? for vec of maps, vec of recs, and then rec of vecs?

selfsame [4:38 PM] yeah

bbloom [4:38 PM] awesome.

[4:41]
that’s pretty crazy - you sure that’s right? i believe it, but wow, that’s a big difference

[4:41]
also, how are you calculating those sizes?

[4:41]
just total heap size on the tests?

selfsame [4:44 PM] yeah i wasn't sure if heap is reliable

bbloom [4:44 PM] i mean, best you’re gonna get from a browser’s profilers

selfsame [4:45 PM] yeah the drill down adds up so guess it's accurate

bbloom [5:03 PM] so does this seem like something that would be genuinely useful?

selfsame [5:08 PM] feel like not with just the memory improvement

bbloom [5:09 PM] no? presumably it would be much faster too. less memory = better cache usage

selfsame [5:11 PM] yeah maybe if it was (nth (nth foo k) i)

bbloom [5:17 PM] no i mean it doens’t even matter what the operations are

[5:17]
if the memory size is smaller, the program will be faster

[5:17]
b/c it doesn’t have to load stuff in to cache as much, which is the slowest thing cpus do

selfsame [5:22 PM] ah gotcha. Personally i'm most interested in the lookup time stuff

bbloom [5:23 PM] i bet you get much faster lookup b/c the one map (the one with the vectors in it) will be cached frequently, so it will be cheap to scan for the vector

selfsame [5:25 PM] incedentally a mutable SoA blows this all out of the water

[5:25]
tempting to just make this copy on write

bbloom [5:25 PM] ha, of course

[5:25]
yeah - that’s when it gets interesting: if you do transients

[5:26]
i think the design of transients is kinda fucked up tho

[5:27]
instead of having to use the return value of the ! methods, it should just be a classic damn mutable structure

[5:27]
and instead of persistent! killing the transient, it should just make it copy-on-write

[5:27]
ie it should be “snapshot"

[5:27]
makes transients much more useful

[5:27]
also, mutable + snapshots is kinda the right default for graphs

[5:28]
but of course you always still want snapshots 🙂

selfsame [5:28 PM] this? https://github.com/clojure/clojurescript/blob/r1.9.293-83-g101d7d9/src/main/cljs/cljs/core.cljs#L5551 GitHub clojure/clojurescript clojurescript - Clojure to JS compiler

bbloom [5:29 PM] yes

[5:29]
this is a design bug: (throw (js/Error. "conj! after persistent!"))

[5:29]
IMO

[5:29]
you don’t need that if you add one more level of indirection

[5:29]
instead of (deftype TransientVector [cnt shift ...

[5:29]
you do (deftype TransientVector [root-node ...

[5:30]
and then root-node gets cnt, shift etc

[5:30]
and you no longer need the throws after persistent!, it’s the same as if you called persistent! and then immediately called transient again


selfsame [5:47 PM] https://github.com/selfsame/soa GitHub selfsame/soa soa - clojure Structure of Arrays data type

[5:47]
give it a look over when you get a chance

bbloom [7:27 PM] is the example supposed to have that outer [] on it?

selfsame [7:28 PM] ah nope

bbloom [7:29 PM] i’m confused about the reduce in the gupdate - why is that necessary?

selfsame [7:32 PM] yeah so that's iterating the [k v] of the updated record to update the Graph record vecs

bbloom [7:32 PM] ah i see - if you update a node, you might add stuff to the whole graph gotcha

[7:32]
hmm interesting

selfsame [7:32 PM] were you thinking more of a (gupdate o id k f) where the key is specified?

bbloom [7:33 PM] i wasn’t thinking that, but now i am 😉

selfsame [7:33 PM] also not sure what do do about changed record signatures, if you assoc or dissoc a key

bbloom [7:34 PM] well you can’t dissoc a basis key from a record

[7:34]
so i’d just follow that same simple design

[7:34]
but additionally assoc’d keys are ok

selfsame [7:35 PM] and that would populate a new key vec to the graph

bbloom [7:35 PM] i guess so

[7:35]
this stuff is tricky b/c graphs have to have both a graph object and the node id/object

[7:35]
it’s a shame that you can’t just reuse update-in or assoc-in

selfsame [7:36 PM] yeah i tried to make the graph associative and update-in wasn't working

[7:36]
but maybe that was on my side

bbloom [7:36 PM] those functions take paths as data vectors, but they are not really general enough for things besides maps/vecs, you kinda want something like (defn update-in [x place f & args] …) where “place” is something smarter than just a vector of keys

[7:36]
but that’s a whole other complex road to go down

selfsame [7:37 PM] so re: earlier about transients

[7:38]
does (transient foo-graph) and then have a gupdate! variant sound reasonable?

bbloom [7:38 PM] i think so

[7:40]
hm, interesting too that the gget operation may return a bunch of nil keys for each element

[7:41]
i wonder if maybe you need a sentinel value for missing keys

[7:41]
i may just shadow the core names for get, assoc, dissoc, btw, then from the outside can use them g/assoc etc

[7:42]
so assuming like (defrecord Point [x y]) and then did:

[7:43]
wait a lick

[7:43]
how is that new syntax working?

[7:43]
i think that’s only working b/c it’s cljs

[7:44]
in clj new needs to take a class name syntactically

selfsame [7:44 PM] haha dang

bbloom [7:44 PM] might be better to give the construction function a prototype object

[7:44]
like assuming (defrecord Point [x y])

[7:44]
you could do:

selfsame [7:44 PM] ok yeah, and it could uses those values as the defaults

bbloom [7:44 PM] (graph 5 (Point. nil nil))

[7:44]
or (graph 5 (Point. 0 0))

[7:44]
yeah

[7:45]
also the record type is probably not even strictly necessary - could also take any old map as a prototype

[7:45]
and {} by default

[7:46]
similarly, probably can get away w/o the size constructor if the graph supported conj

[7:47]
(-> (g/make) (conj {:x 1 :y 2}) (g/assoc 0 :z 3) (g/update 0 :x dec)) => #graph [{:x 0 :y 2 :z 3}]

[7:48]
or #soa

[7:48]
or, if you did the derecord:

[7:49]
(-> (g/make (Point. 0 0)) (conj {:y 1})) => #soa [#Point{:x 0 :y 1}]

[7:49]
assuming the conj does a merge with the default

selfsame [7:50 PM] hmm

[7:51]
yeah with a template you could use maps

bbloom [7:52 PM] even without a template i think you could use maps - you just extend it when you do a g/assoc

selfsame [7:52 PM] but the type of the graph rec is the type of the gget

[7:52]
maybe that's not really important though

bbloom [7:52 PM] yeah, instead of type + map of vecs, you’d have a prototype + map of vecs

[7:53]
you can even use the prototype to create the map of vecs

[7:53]
assuming you don’t have type hints on the fields

[7:54]
which i guess you never really do in cljs

selfsame [7:56 PM] yeah graph just transforms the template into the map of vecs

Clone this wiki locally