## Unlimited Detail Technology?

**category:**offtopic [glöplog]

Could somebody, please, implement this:

or this

I did try (and many other variants) but the results are no good: http://www.gamedev.ru/files/?id=81959

Hence this request. Apologies to gamedev.ru's members for bringing this up here.

**Code:**

```
/* Octree nodes are the buckets (internal nodes have <= 8 buckets: their children);
frustum octants are the things being sorted.
The hierarchy of squares is given by a 1 bit per node complete quadtree stored in level order (hence, father-children relationships are "computable").
The hierarchy of cubes is a sparse octree.
Call bucket_sort with cube = root node and stack = (view frustum, viewport) */
bucket_sort(cube, stack)
{
if (cube is a point)
do {
(square, frustum) = pop(stack)
if (!square.black) {
square.color = cube.color
do {
square.black = true
if (square is root)
return
square = square.father
} while (square[XY].black for all XY in { LU, RU, LD, RD })
}
} while (stack)
else {
do {
(square, frustum) = pop(stack)
if (!square.black) {
if (frustum in cube[XYZ])
push((square, frustum), stack[XYZ])
else /* octosect frustum, quadrisect square */
for (XY in { LU, RU, LD, RD }) /* LU: left-up quadrant, etc. */
if (!square[XY].black)
for (Z in { F, B }) /* F: front, B: back */
push((square[XY], frustum[XYZ]), stack)
}
} while (stack)
for (XYZ in front_to_back) /* depends on the eye's octant */
if (stack[XYZ]) /* if subcube "cube[XYZ]"'s stack non-empty */
bucket_sort(cube[XYZ], stack[XYZ])
}
}
```

or this

**Code:**

```
/* Frusta are the discriminants. Classical Bresenham traverses a raster/grid; this Bresenham traverses a hierarchy. Here the other dimension of space is the major axis: scale. */
hierarchical_bresenham(cube, frustum, square)
{
if (cube is a point) { /* leaf node */
square.color = cube.color
square.black = true /* i.e., is opaque */
} else { /* non-leaf */
if (frustum in cube[XYZ]) /* descend in non-empty subcube "cube[XYZ]" */
hierarchical_bresenham(cube[XYZ], frustum, square)
else { /* octosect frustum, quadrisect square */
for (Z in { F, B }) /* front frusta first */
for (XY in { LU, RU, LD, RD })
if (!square[XY].black) /* if subsquare XY not filled */
hierarchical_bresenham(cube, frustum[XYZ], square[XY])
if (square[XY].black for all XY in { LU, RU, LD, RD })
square.black = true /* the region is opaque */
}
}
}
```

I did try (and many other variants) but the results are no good: http://www.gamedev.ru/files/?id=81959

Hence this request. Apologies to gamedev.ru's members for bringing this up here.

The somelike that UD algo has been worked out a long time ago

check here:

http://www.youtube.com/watch?v=W-hOoBa1hxo&feature=related

and

and my own here :

http://www.youtube.com/watch?v=tyxO3SIAJTs

check here:

http://www.youtube.com/watch?v=W-hOoBa1hxo&feature=related

and

and my own here :

http://www.youtube.com/watch?v=tyxO3SIAJTs

Yes, and we're still hoping that you'll re-implement it :-)

hehe I am far too lazy for that dood :) it's a few hours work at most - 'cmon chop chop - get coding man!

Your hierarchical model to view transform is, let me insist, very good. Still, there's that nasty / at the end. UD claims that there are no * or / in the core loop. What is proposed above accomplishes just that.

Interesting - the / isn't that expensive 27 cycles max on a modern cpu?

that Current algo - can cull a huge amount of nodes with 1 divide - that's the real trick of it, not the delta changes that get passed down.

Your algo seems interesting but I don't quite understand at first read. I did try some sorting with a stack at first attempt but it was waay to slow. If I get time i'll have a bash at ur algo. The main reason I'm not re coding the current algo it - it's basically useless in todays tech era - the power of shaders for lighting - which the absolute most important thing nowadays, just over shadows any geometry drawing speed - even tho u can draw fast with that algo - you're just gonna get awesome lighting in software - at best u'll get dot lighting and that's really it.

that Current algo - can cull a huge amount of nodes with 1 divide - that's the real trick of it, not the delta changes that get passed down.

Your algo seems interesting but I don't quite understand at first read. I did try some sorting with a stack at first attempt but it was waay to slow. If I get time i'll have a bash at ur algo. The main reason I'm not re coding the current algo it - it's basically useless in todays tech era - the power of shaders for lighting - which the absolute most important thing nowadays, just over shadows any geometry drawing speed - even tho u can draw fast with that algo - you're just gonna get awesome lighting in software - at best u'll get dot lighting and that's really it.

Hey Kais

I went through your first algo - that seems really promising - it's like a hierachical Zbuffer - But I stuck on understanding a few things from the notation:

Frustum[XYZ] - what does the XYZ access do to the frustum?

Stack[XYZ] - again what does the XYZ access do - it that the octrees children?

and should

while (ScreenTile[XY].Used for all XY in { LU, RU, LD, RD })

actually be

while (!ScreenTile[XY].Used for all XY in { LU, RU, LD, RD }) ?

and I renamed it so it's easier to understand:

/* Octree nodes are the buckets (internal nodes have <= 8 buckets: their children);

frustum octants are the things being sorted.

The hierarchy of ScreenTile is given by a 1 bit per node complete quadtree stored in level order (hence, father-children relationships are "computable").

The hierarchy of cubes is a sparse octree.

Call bucket_sort with cube = root node and stack = (view frustum, viewport) */

bucket_sort(cube, stack)

{

if (cube is a point)

{

do {

(ScreenTile, frustum) = pop(stack)

if (!ScreenTile.Used)

{

ScreenTile.Colour = cube.Colour

do

{

ScreenTile.Used = true

if (!ScreenTile.Parent)

return

ScreenTile = ScreenTile.Parent;

}

while (ScreenTile[XY].Used for all XY in { LU, RU, LD, RD })

}

}

while (stack)

}

else

{

do

{

(ScreenTile, frustum) = pop(stack)

if (!ScreenTile.Used)

{

if (frustum in cube[XYZ])

push((ScreenTile, frustum), stack[XYZ])

else /* octosect frustum, quadrisect square */

for (XY in { LU, RU, LD, RD }) /* LU: left-up quadrant, etc. */

if (!ScreenTile[XY].Used)

for (Z in { F, B }) /* F: front, B: back */

push((ScreenTile[XY], frustum[XYZ]), stack)

}

}

while (stack)

for (XYZ in front_to_back) /* depends on the eye's octant */

if (stack[XYZ]) /* if subcube "cube[XYZ]"'s stack non-empty */

bucket_sort(cube[XYZ], stack[XYZ])

}

}

I went through your first algo - that seems really promising - it's like a hierachical Zbuffer - But I stuck on understanding a few things from the notation:

Frustum[XYZ] - what does the XYZ access do to the frustum?

Stack[XYZ] - again what does the XYZ access do - it that the octrees children?

and should

while (ScreenTile[XY].Used for all XY in { LU, RU, LD, RD })

actually be

while (!ScreenTile[XY].Used for all XY in { LU, RU, LD, RD }) ?

and I renamed it so it's easier to understand:

/* Octree nodes are the buckets (internal nodes have <= 8 buckets: their children);

frustum octants are the things being sorted.

The hierarchy of ScreenTile is given by a 1 bit per node complete quadtree stored in level order (hence, father-children relationships are "computable").

The hierarchy of cubes is a sparse octree.

Call bucket_sort with cube = root node and stack = (view frustum, viewport) */

bucket_sort(cube, stack)

{

if (cube is a point)

{

do {

(ScreenTile, frustum) = pop(stack)

if (!ScreenTile.Used)

{

ScreenTile.Colour = cube.Colour

do

{

ScreenTile.Used = true

if (!ScreenTile.Parent)

return

ScreenTile = ScreenTile.Parent;

}

while (ScreenTile[XY].Used for all XY in { LU, RU, LD, RD })

}

}

while (stack)

}

else

{

do

{

(ScreenTile, frustum) = pop(stack)

if (!ScreenTile.Used)

{

if (frustum in cube[XYZ])

push((ScreenTile, frustum), stack[XYZ])

else /* octosect frustum, quadrisect square */

for (XY in { LU, RU, LD, RD }) /* LU: left-up quadrant, etc. */

if (!ScreenTile[XY].Used)

for (Z in { F, B }) /* F: front, B: back */

push((ScreenTile[XY], frustum[XYZ]), stack)

}

}

while (stack)

for (XYZ in front_to_back) /* depends on the eye's octant */

if (stack[XYZ]) /* if subcube "cube[XYZ]"'s stack non-empty */

bucket_sort(cube[XYZ], stack[XYZ])

}

}

whoops - forgot the code tag :P

**Code:**

```
/* Octree nodes are the buckets (internal nodes have <= 8 buckets: their children);
frustum octants are the things being sorted.
The hierarchy of squares is given by a 1 bit per node complete quadtree stored in level order (hence, father-children relationships are "computable").
The hierarchy of cubes is a sparse octree.
Call bucket_sort with cube = root node and stack = (view frustum, viewport) */
bucket_sort(cube, stack)
{
if (cube is a point)
{
do {
(ScreenTile, frustum) = pop(stack)
if (!ScreenTile.Used)
{
ScreenTile.Colour = cube.Colour
do
{
ScreenTile.Used = true
if (!ScreenTile.Parent)
return
ScreenTile = ScreenTile.Parent;
}
while (ScreenTile[XY].Used for all XY in { LU, RU, LD, RD })
}
}
while (stack)
}
else
{
do
{
(ScreenTile, frustum) = pop(stack)
if (!ScreenTile.Used)
{
if (frustum in cube[XYZ])
push((ScreenTile, frustum), stack[XYZ])
else /* octosect frustum, quadrisect square */
for (XY in { LU, RU, LD, RD }) /* LU: left-up quadrant, etc. */
if (!ScreenTile[XY].Used)
for (Z in { F, B }) /* F: front, B: back */
push((ScreenTile[XY], frustum[XYZ]), stack)
}
}
while (stack)
for (XYZ in front_to_back) /* depends on the eye's octant */
if (stack[XYZ]) /* if subcube "cube[XYZ]"'s stack non-empty */
bucket_sort(cube[XYZ], stack[XYZ])
}
}
```

frustum[XYZ] represents the XYZ subfrustum of the given frustum. XYZ is LUF, RUF, LDF, RDF, LUB, RUB, LDB, RDB where L: left, R: right, U: up, D: down, F: front and B: back. So frustum[RDF] (XYZ = RDF) is the right-down-front subfrustum.

stack[XYZ] is a stack associated with the subcube XYZ (XYZ as above).

"square[XY].black for all XY in { LU, RU, LD, RD }" means "the given square's 4 subsquares are completely filled"

The idea behing this family of algorithms is to render (the visible part of) the view frustum "on" the object space instead of rendering the visible part of the view frustum on the viewplane. Unfortunately I could not make it fast.

Thanks for the interest!

stack[XYZ] is a stack associated with the subcube XYZ (XYZ as above).

"square[XY].black for all XY in { LU, RU, LD, RD }" means "the given square's 4 subsquares are completely filled"

The idea behing this family of algorithms is to render (the visible part of) the view frustum "on" the object space instead of rendering the visible part of the view frustum on the viewplane. Unfortunately I could not make it fast.

Thanks for the interest!

New algo's are always interesting!!

I still don't understand the concept of a sub frustum - can you explain that a bit further - thx?

When I was investigating the UD stuff - the slowest aspect of the algo was the traversal of nodes - the sheer number (in the millions) makes it slow - it totally whacks the cache. So the biggest speed up win is all based around how quickly you can cull nodes and it's children. The z buffer and conic normal culls nodes very quickly. Even tho you have to transform a node to 2d screen space and check it's z buffer space - say for a node occupies 50x50 pixels - you do a loop check of that z buffer area - that might seem slow - but it is far quicker when that z buffer space is filled and you exit that entire node and it's children.

Any kind of stack or sorting per node is going to be really slow - I did a few of these when trying - and the memory access for this kind of thing just adds to the overall memory bandwidth. Also any stack stuff you try and do yourself, will be slower than doing it recursively since the cpu is totally geared up for that kind of thing.

I still don't understand the concept of a sub frustum - can you explain that a bit further - thx?

When I was investigating the UD stuff - the slowest aspect of the algo was the traversal of nodes - the sheer number (in the millions) makes it slow - it totally whacks the cache. So the biggest speed up win is all based around how quickly you can cull nodes and it's children. The z buffer and conic normal culls nodes very quickly. Even tho you have to transform a node to 2d screen space and check it's z buffer space - say for a node occupies 50x50 pixels - you do a loop check of that z buffer area - that might seem slow - but it is far quicker when that z buffer space is filled and you exit that entire node and it's children.

Any kind of stack or sorting per node is going to be really slow - I did a few of these when trying - and the memory access for this kind of thing just adds to the overall memory bandwidth. Also any stack stuff you try and do yourself, will be slower than doing it recursively since the cpu is totally geared up for that kind of thing.

@Shabby you seem pretty close to have that fast culling/traversal algo behind UL :) I think this is one half of the magic there. I suspect strongly that the other half is that the scene is a tree of instanced octrees. The algorithm would work pretty much the same, and it would explain the high level of parts reuse on all UL demos scenes, and the "huge amount of data" ramblings.

@marmakoide

I agree with u 100% it's 2 trees

1) an octree for each object

2) some kind of tree for each instance - this can be a really simple affair.

The z buffer algo - allows u draw instances and they will get culled properly as long as you draw front first.

If the objects are all rotated the same - the offset deltas (might need scaling) and node sort order will be the same for every instance :)

I'm wondering why they have not shown stuff moving about and rotating - because with the zbuffer algo and instances - u can do this easily?

I agree with u 100% it's 2 trees

1) an octree for each object

2) some kind of tree for each instance - this can be a really simple affair.

The z buffer algo - allows u draw instances and they will get culled properly as long as you draw front first.

If the objects are all rotated the same - the offset deltas (might need scaling) and node sort order will be the same for every instance :)

I'm wondering why they have not shown stuff moving about and rotating - because with the zbuffer algo and instances - u can do this easily?

A subfrustum is an octant of frustum, just like a subcube is an octant of cube in octrees.

How did you store your octree? Your method is very interesting. Can you give us its pseudo-code?

How did you store your octree? Your method is very interesting. Can you give us its pseudo-code?

@Shabby True, the "instanced octrees" are always axis-aligned, and UL's approach might heavily rely on this (thus, no rotations of objects shown). Not moving around, hum, might be that some heavy precalcs are involved, some form of spatial structure like BVH/BIH/kd-tree ?

@marmakoide

Interesting you could be right with the lack of movement - because of restrictions with the second tree - you need to sort correctly else the whole thing falls apart.

@kais - no problem pseudo code was something like this - I did it a long time ago - but it was very similar to this:

Interesting you could be right with the lack of movement - because of restrictions with the second tree - you need to sort correctly else the whole thing falls apart.

@kais - no problem pseudo code was something like this - I did it a long time ago - but it was very similar to this:

**Code:**

```
void RenderNodeUnClipped(Node,WorldPos,Depth=1)
{
if (WorldPos.z>SomeScreenSizeValue)
{
ScreenPos=FastPerspectiveTransform(WorldPos);
RenderPixel(ScreenPos,Node.Colour);
}
else
{
if (ConicNormalFacingAwayFromCamera)
return;
ScreenRect=PerspectiveTransform(WorldPos)+RectSize;
if (!TestZ(ScreenRect) return;
if (Node.Child0) RenderNodeUnClipped(Node.Child0,WorldPos+Deltas[0]/Depth,Depth+1);
if (Node.Child1) RenderNodeUnClipped(Node.Child1,WorldPos+Deltas[1]/Depth,Depth+1);
if (Node.Child2) RenderNodeUnClipped(Node.Child2,WorldPos+Deltas[2]/Depth,Depth+1);
if (Node.Child3) RenderNodeUnClipped(Node.Child3,WorldPos+Deltas[3]/Depth,Depth+1);
if (Node.Child4) RenderNodeUnClipped(Node.Child4,WorldPos+Deltas[4]/Depth,Depth+1);
if (Node.Child5) RenderNodeUnClipped(Node.Child5,WorldPos+Deltas[5]/Depth,Depth+1);
if (Node.Child6) RenderNodeUnClipped(Node.Child6,WorldPos+Deltas[6]/Depth,Depth+1);
if (Node.Child7) RenderNodeUnClipped(Node.Child7,WorldPos+Deltas[7]/Depth,Depth+1);
}
void RenderNodeClipped(Node,WorldPos,Depth=1)
{
if (WorldPos.z>SomeScreenSizeValue)
{
ScreenPos=FastPerspectiveTransform(WorldPos);
RenderPixel(ScreenPos,Node.Colour);
}
else
{
if (ConicNormalFacingAwayFromCamera)
return;
ScreenRect=PerspectiveTransform(WorldPos)+RectSize;
bool DidClip;
bool OnScreen;
ClipScreenRect(ScreenRect,DidClip,Onscreen);
if (!TestZ(ScreenRect) return;
if (OnScreen)
{
if (DidClip)
{
if (Node.Child0) RenderNodeClipped(Node.Child0,WorldPos+Deltas[0]/Depth,Depth+1);
if (Node.Child1) RenderNodeClipped(Node.Child1,WorldPos+Deltas[1]/Depth,Depth+1);
if (Node.Child2) RenderNodeClipped(Node.Child2,WorldPos+Deltas[2]/Depth,Depth+1);
if (Node.Child3) RenderNodeClipped(Node.Child3,WorldPos+Deltas[3]/Depth,Depth+1);
if (Node.Child4) RenderNodeClipped(Node.Child4,WorldPos+Deltas[4]/Depth,Depth+1);
if (Node.Child5) RenderNodeClipped(Node.Child5,WorldPos+Deltas[5]/Depth,Depth+1);
if (Node.Child6) RenderNodeClipped(Node.Child6,WorldPos+Deltas[6]/Depth,Depth+1);
if (Node.Child7) RenderNodeClipped(Node.Child7,WorldPos+Deltas[7]/Depth,Depth+1);
}
else
{
if (Node.Child0) RenderNodeUnClipped(Node.Child0,WorldPos+Deltas[0]/Depth,Depth+1);
if (Node.Child1) RenderNodeUnClipped(Node.Child1,WorldPos+Deltas[1]/Depth,Depth+1);
if (Node.Child2) RenderNodeUnClipped(Node.Child2,WorldPos+Deltas[2]/Depth,Depth+1);
if (Node.Child3) RenderNodeUnClipped(Node.Child3,WorldPos+Deltas[3]/Depth,Depth+1);
if (Node.Child4) RenderNodeUnClipped(Node.Child4,WorldPos+Deltas[4]/Depth,Depth+1);
if (Node.Child5) RenderNodeUnClipped(Node.Child5,WorldPos+Deltas[5]/Depth,Depth+1);
if (Node.Child6) RenderNodeUnClipped(Node.Child6,WorldPos+Deltas[6]/Depth,Depth+1);
if (Node.Child7) RenderNodeUnClipped(Node.Child7,WorldPos+Deltas[7]/Depth,Depth+1);
}
}
}
}
```

How did you store the octree?

I think I remember trying 3 different trees - it's all very vague coz it was a long time ago.

the first one was based on a menger sponge thing I did which is simply

8 u32's pointing to the next set of eight nodes.

I think I then tried 2 other compact organizations, since the memory/cache impact was huge - one was alot quicker and quick to code and the last one I tried took me ages to code and was slower. Unfortunately I cannot remember exactly what I tried :(

the first one was based on a menger sponge thing I did which is simply

8 u32's pointing to the next set of eight nodes.

**Code:**

```
u32 Nodes[16384]={ 8, 16, 24, 32, 40, 48, 56, 64,
0x00,0x00,0x00,0xff,0x00,0xff,0xff,0xff,
0x00,0x00,0xff,0x00,0xff,0x00,0xff,0xff,
0x00,0xff,0x00,0x00,0xff,0xff,0x00,0xff,
0xff,0x00,0x00,0x00,0xff,0xff,0xff,0x00,
0x00,0xff,0xff,0xff,0x00,0x00,0x00,0xff,
0xff,0x00,0xff,0xff,0x00,0x00,0xff,0x00,
0xff,0xff,0x00,0xff,0x00,0xff,0x00,0x00,
0xff,0xff,0xff,0x00,0xff,0x00,0x00,0x00,
0xff,0xff,0xff,0xff, 80, 80, 80, 80,
0xff,0xff,0xff,0xff, 88, 88, 88, 88,
0xff,0xff,0xff,0xff, 96, 96, 96, 96,
0xff,0xff,0xff,0xff, 0, 0, 0, 0};
```

I think I then tried 2 other compact organizations, since the memory/cache impact was huge - one was alot quicker and quick to code and the last one I tried took me ages to code and was slower. Unfortunately I cannot remember exactly what I tried :(

hum... is that a *dense* octree ? then, it's begging for z-order :)

no Idea what kind of octree that is - it came from a different project - dabbling in something else :P which eventually turn into a octree renderer - and they eventually turned into a totally different project - hence no code anymore. I'm sure there are academic papers on how to store octree's for maximum size,speed and cache efficiency.

> I'm sure there are academic papers on how to store octree's for

> maximum size,speed and cache efficiency.

Linear octrees, hash table.

> maximum size,speed and cache efficiency.

Linear octrees, hash table.

from your posts I understood that you have found a way to render octrees without raycasting, right?

Shabby/Z. J. has invented an efficient non raycasting-based octree renderer. It is also quite elegant. Too bad he wants not to re-implement it. Russians have come up with various approaches (e.g., this one by JX: http://chorasimilarity.wordpress.com/2013/01/19/discussion-about-how-an-ud-algor ithm-might-work/).