Things I did
Optimize all the things
I did two rounds of optimization on the code, so the Python part is faster.
Each blade of grass is a cone, and previously each one of them would result in code similar to:
cone {
    // Coordinates:
    <0.11751987765853868, 0.33079267650261673, 0.09734413318477975>
    1e-05
    <0.11751987765853868, 0.331318350241197, 0.09734413318477975>
    1e-08
    // Color:
    texture {
        pigment {
            rgb<0.30000000000000016, 0.3000000000000006, 0.9999999999999999>
        }
        finish {
            ambient "White"
        }
    }
    no_shadow
}
This is problematic for two reasons:
- The code that converts Python objects to a povray file is a known bottleneck, and it is not trivially removable, as I rely heavily on reflection.
- Povray files grow large(ish), and reading and writing them takes time.
The good thing is that povray allows me to cut down on the redundancy in the above definition by leveraging the #declare directive.
It goes roughly like this:
#declare O1 = cone {
    <0.0, -0.0008597359127325056, 0.0>
    4.090781197508472e-06
    <0.0, 0.0008597359127325056, 0.0>
    2.4545469110087494e-06
    texture {
        finish { ambient White }
        pigment { Red }
        // ...
    }
}
object {
    O1
    translate <1, 2, 3>
}
object {
    O1
    translate <3, 4, 5>
}
So first I #declare an object, then I can instantiate it twice by referencing its name, specifying only the final transformation that puts it in the desired place.
This cut the file size roughly threefold, and cut the rendering time for an image (rendering here means both preparing the .pov file and running povray) in half.
In the project I decided to define 300 separate grass blades, and then instantiate them as many times as I need.
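The declare-once, instantiate-many pattern can be sketched in a few lines of Python. This is only an illustration, not the project's actual code: the function names (declare_blade, place_blade, build_scene), the Blade0, Blade1, ... naming scheme and the size ranges are all made up for the example.

```python
import random


def declare_blade(name: str, half_height: float, base_radius: float, tip_radius: float) -> str:
    """Return a #declare statement for one reusable, cone-shaped blade."""
    return (
        f"#declare {name} = cone {{\n"
        f"    <0, {-half_height}, 0>, {base_radius},\n"
        f"    <0, {half_height}, 0>, {tip_radius}\n"
        f"}}\n"
    )


def place_blade(name: str, x: float, y: float, z: float) -> str:
    """Return an object block that instantiates a declared blade at (x, y, z)."""
    return f"object {{ {name} translate <{x}, {y}, {z}> }}\n"


def build_scene(blade_count: int, instance_count: int, seed: int = 0) -> str:
    """Declare blade_count blades once, then reference them instance_count times."""
    rng = random.Random(seed)
    names = [f"Blade{i}" for i in range(blade_count)]  # hypothetical naming scheme
    # Every distinct blade is declared exactly once (sizes are arbitrary here)...
    parts = [
        declare_blade(name, rng.uniform(0.0005, 0.001), 4e-06, 2e-06)
        for name in names
    ]
    # ...and then referenced by name as often as needed.
    for _ in range(instance_count):
        parts.append(place_blade(rng.choice(names), rng.random(), 0.0, rng.random()))
    return "".join(parts)
```

The full cone definition is written once per declared blade, and each instance costs a single short object line. That is where the file size savings come from.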
The second optimization was only in Python. I knew (by knew I mean I measured) that the code that formats the Python objects is a bottleneck, so I decided to simplify it.
Instead of doing:
for _ in range(grass_count):
    yield TransformedObject(
        referenced_object=grass_generator.choice(cone_names),
        transformations=[
            Translate(x, y, z)
        ]
    )
I decided to format the string explicitly:
@attr.s(auto_attribs=True, frozen=True)
class TransformedObject(TransformedObjectProps, WithTransformation):
    @classmethod
    def render_object_with_translation(cls, name: str, x: Num, y: Num, z: Num):
        return f"""
        object {{
            {name}
            translate <{x}, {y}, {z}>
        }}
        """
and then:
for __ in range(self.count):
    position_x, position_z = self.field(grass_generator)
    positions = self.height_field.get_coordinates_indices(
        position_x, position_z, 0
    )
    if positions is None:
        continue
    yield TransformedObject.render_object_with_translation(
        grass_generator.choice(cone_names), *positions
    )
This bypasses the whole code formatting machinery, and it gave me a performance boost of more than 10%.
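For the curious, a measurement like this can be taken with the standard library's cProfile. The sketch below is not the project's code: render_object and generate_scene are hypothetical stand-ins for the real formatting step and generation loop, and profile_report is just a helper name I picked.

```python
import cProfile
import io
import pstats


def render_object(name: str, x: float, y: float, z: float) -> str:
    # Hypothetical stand-in for the formatting step being measured.
    return f"object {{ {name} translate <{x}, {y}, {z}> }}\n"


def generate_scene(count: int) -> str:
    # Hypothetical stand-in for the generation loop.
    return "".join(render_object("O1", i, 0.0, i) for i in range(count))


def profile_report(count: int = 10_000, top: int = 5) -> str:
    """Profile generate_scene and return the `top` entries by cumulative time."""
    profiler = cProfile.Profile()
    profiler.enable()
    generate_scene(count)
    profiler.disable()
    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    stats.sort_stats("cumulative").print_stats(top)
    return buffer.getvalue()
```

Printing profile_report() lists the functions that account for most of the run time, which is how a formatting bottleneck would show up.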
Attempt to use pypy
I tried to have the code run faster by using pypy, and it didn't work.
I suspect the reason is that I use multiple calls to numpy code, which are not vectorized, namely:
- Getting the y position for "ground" (i.e. height_field object);
- Generating random numbers one by one;
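To illustrate what "not vectorized" means here, the sketch below contrasts per-blade numpy calls with one batched call. It is an assumption-laden toy: the height field is a plain 2D array indexed as heights[z, x], positions are integer grid indices, and both function names are invented for the example. I have not checked whether batching like this would actually make the pypy run faster.

```python
import numpy as np


def positions_one_by_one(rng, heights, count):
    """One small numpy call per blade: many crossings into numpy code."""
    size_z, size_x = heights.shape
    result = []
    for _ in range(count):
        x = rng.integers(0, size_x)
        z = rng.integers(0, size_z)
        result.append((x, heights[z, x], z))
    return result


def positions_vectorized(rng, heights, count):
    """A few numpy calls in total, each working on whole arrays."""
    size_z, size_x = heights.shape
    xs = rng.integers(0, size_x, size=count)
    zs = rng.integers(0, size_z, size=count)
    ys = heights[zs, xs]  # fancy indexing looks up all heights at once
    return np.column_stack((xs, ys, zs))
```

Both functions return the same kind of data (x, y, z per blade), but the second one makes three numpy calls regardless of count.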
The good thing is that trying pypy (and downgrading Python from 3.8 to 3.6) was almost painless. I just needed to add the dataclass backport module, and apart from that all my non-dev dependencies worked like a charm.
Grass rendering
If you think that I did these optimizations to make the program faster, you are wrong! I did them so I can add more objects :).
So here are the changes to how grass is rendered:
Previously each blade of grass was a cone that was standing upright.
Right now each blade of grass is a flat triangle (or very flattened cone) that:
- Is more flat;
- Is rotated around the y axis (y axis goes up);
- Is tilted somewhat around x and z axes.
I have also added more grass (roughly an order of magnitude more).
If I were to add the above without the optimizations, the file size would balloon out of proportion.
Technically, I added the following code to each cone representing a blade of grass:
// Flatten it
scale <0.0001, 1.0, 1.0>
// Rotate around the y axis
rotate <0.0, 112.72518228678813, 0.0>
// Tilt around the x and z axes
rotate <-5.191745887469165, 0.0, -10.305586857373825>
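On the Python side, producing such a snippet per blade could look roughly like the sketch below. The function name blade_transformations and the 15 degree tilt limit are assumptions made for the example; only the flatten, rotate, tilt sequence comes from the snippet above.

```python
import random


def blade_transformations(rng: random.Random, max_tilt_deg: float = 15.0) -> str:
    """Return the scale and rotate lines for one randomly oriented blade."""
    yaw = rng.uniform(0.0, 360.0)                        # rotation around the y (up) axis
    tilt_x = rng.uniform(-max_tilt_deg, max_tilt_deg)    # assumed tilt range
    tilt_z = rng.uniform(-max_tilt_deg, max_tilt_deg)
    return (
        "scale <0.0001, 1.0, 1.0>\n"                     # flatten along x first
        f"rotate <0.0, {yaw}, 0.0>\n"                    # then turn the flat blade
        f"rotate <{tilt_x}, 0.0, {tilt_z}>\n"            # then lean it slightly
    )
```

The order matters: flattening has to happen before the rotations, otherwise the scale would squash the already rotated blade along the scene's x axis.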
Far away grass patches
Far away parts of the image look "bare" or just boring, so I wanted to add far-away grass so it looks more "real". I got very mixed results:
- Small blades of grass turned out to be near invisible;
- When I added larger blades of grass in the distance, sometimes they ended up near the camera.
I'll probably remove the "far-away" grass fields from development renderings, but leave them in production builds, as they do add some flavour.
Misc
- Reworked the transformation code, so I can actually specify the ordering of transformations (transformations are not commutative, that is, their order does matter);
- Fixed all mypy errors, and removed some mypy # type: ignore comments (I use mypy to check for bugs and inconsistencies in type interfaces);
- Started printing file size and runtime;
- Played with the povray -Q<int> switch. I did not find a noticeable speed change when changing quality, so I fixed it at -Q9, which is the best quality without using radiosity (radiosity is the way povray models ambient lighting, that is, light reflected from other objects);
- Optimized height_field (the object representing the "ground") so that getting the height at a given position is somewhat faster, by:
  - Moving as much of the validation as possible to the constructor;
  - Running the code with PYTHONOPTIMIZE, which removes asserts;
- Added some smoke tests, that is: tests that run the generation pipeline and then check that A) nothing throws an exception and B) a non-empty png file shows up.
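Why the ordering of transformations matters can be shown with a tiny pure-Python example. This is not the project's transformation code: rotate_y, translate and apply are names invented for the sketch, and the rotation direction is an arbitrary convention that may not match povray's.

```python
import math


def rotate_y(point, degrees):
    """Rotate a point around the y axis (direction is an arbitrary convention)."""
    x, y, z = point
    angle = math.radians(degrees)
    cos_a, sin_a = math.cos(angle), math.sin(angle)
    return (x * cos_a + z * sin_a, y, -x * sin_a + z * cos_a)


def translate(point, offset):
    """Move a point by the given offset."""
    return tuple(p + o for p, o in zip(point, offset))


def apply(point, transformations):
    """Apply the transformations in the given order."""
    for transformation in transformations:
        point = transformation(point)
    return point


def move(point):
    return translate(point, (1.0, 0.0, 0.0))


def turn(point):
    return rotate_y(point, 90.0)


origin = (0.0, 0.0, 0.0)
moved_then_turned = apply(origin, [move, turn])
turned_then_moved = apply(origin, [turn, move])
```

Moving first and then turning swings the point around the origin so that it lands about one unit along the z axis, while turning first leaves the origin in place and the move then puts it at <1, 0, 0>. Same two transformations, different results.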
Results
I'll add more images on Mastodon if you are interested.
Plans
Texture the height field
I can't put much more grass into the files, not without losing too much development velocity (I'd like to see updated results after a couple of minutes, not hours).
However, distant parts of the landscape look, well, bare.
So the idea is to generate a very big texture file that would simulate patches of grass using just color, and hope nobody notices in the distance ;)
Add some more objects
Add some more objects, e.g. stones (big and small) or trees.
Make better grass
Use splines? Or maybe something wonky.