Hacker News

4 years ago by raphlinus

Hi again! This post was an early exploration into GPU rendering. The work continues as piet-gpu, and, while it's not yet a complete 2D renderer, there's good progress, and also an active open source community including people from the Gio UI project. Not long ago I implemented nested clipping (which can be generalized to blend modes) and have a half-finished blog post draft. I'm also working on this as part of my work as a researcher on the Google Fonts team. Feel free to ask questions in this thread - I probably won't follow up with everything, as the discussion is pretty sprawling.

4 years ago by lame88

One small error (I think) - I noticed your link to pathfinder linked to someone's 2020 fork of the repository rather than the upstream servo repository.

4 years ago by littlestymaar

> someone's 2020 fork

pcwalton was the developer behind pathfinder at Mozilla (but was part of last summer's layoffs).

4 years ago by Const-me

The quality is not good, and the performance is not even mentioned.

I have used a different approach: https://github.com/Const-me/Vrmac#vector-graphics-engine

My version is cross-platform, tested with Direct3D 12 and GLES 3.1.

My version does not view the GPU as a SIMD CPU; it actually treats it as a GPU.

When rendering a square without anti-aliasing, the library will render 2 triangles. When rendering a filled square with anti-aliasing, the library will render about 10 triangles: a large opaque square in the center, and a thin border about 1 pixel thick around it for AA.

It uses hardware Z buffer with early Z rejection to save pixel shaders and fill rate. It uses screen-space derivatives in the pixel shader for anti-aliasing. It renders arbitrarily complex 2D scenes with only two draw calls, one front to back with opaque stuff, another back to front with translucent stuff. It does not have quality issues with stroked lines much thinner than 1px.
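
A rough sketch of the derivative-based AA idea (a CPU-side Rust illustration with made-up names, not Vrmac's actual shader code): given the signed distance from a pixel center to the nearest edge and the rate at which that distance changes per pixel (what fwidth() reports in a shader), coverage is just the distance pushed through a one-pixel-wide ramp.

    /// Approximate coverage of a pixel by an edge, given the signed distance
    /// from the pixel center to the edge (positive = inside) and the length
    /// of the screen-space gradient of that distance (what fwidth() would
    /// return in a shader). Illustration only, not Vrmac's code.
    fn edge_coverage(signed_distance: f32, fwidth: f32) -> f32 {
        // One pixel of transition, centered on the edge.
        (signed_distance / fwidth + 0.5).clamp(0.0, 1.0)
    }

    fn main() {
        println!("{}", edge_coverage(0.0, 1.0)); // exactly on the edge: 0.5
        println!("{}", edge_coverage(1.0, 1.0)); // a full pixel inside: 1.0
    }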

4 years ago by glowcoil

> The quality is not good, and the performance is not even mentioned.

I notice that your renderer doesn't even attempt to render text on the GPU and instead just blits glyphs from an atlas texture rendered with Freetype on the CPU: https://github.com/Const-me/Vrmac/blob/master/Vrmac/Draw/Sha...

In contrast, piet-gpu (the subject of the original blog post) has high enough path rendering quality (and performance) to render glyphs purely on the GPU. This makes it clear you didn't even perform a cursory investigation of the project before making a comment to dump on it and promote your own library.

4 years ago by Const-me

> instead just blits glyphs from an atlas texture rendered with Freetype on the CPU

Correct.

> has high enough path rendering quality (and performance) to render glyphs purely on the GPU

Do you have screenshots showing quality, and performance measures showing speed? Ideally from Raspberry Pi 4?

> This makes it clear you didn't even perform a cursory investigation of the project

I did, and mentioned it in the docs. Here’s a quote: ā€œI didn’t want to experiment with GPU-based splines. AFAIK the research is not there just yet.ā€ Verifiable through version control: https://github.com/Const-me/Vrmac/blob/bbe83b9722dcb080f1aed...

For text, I think bitmaps are better than splines. I can see how splines are cool from a naĆÆve programmer’s perspective, but practically speaking they are not good enough for the job.

Vector fonts are not resolution independent, because of hinting. Fonts include bytecode, compiled programs that do that adjustment. GPUs are massively parallel vector chips, not a good fit for interpreting the bytecode of a traditional programming language. This means whatever splines you upload to the GPU will only be correct for a single font size; trying to reuse them at a different resolution will cause artifacts.

Glyphs are small and contain lots of curves: lots of data to store, and lots of math to render, for a comparatively small count of output pixels. Copying bitmaps is very fast; modern GPUs, even low-power mobile and embedded ones, are designed to output a ridiculous volume of textured triangles per second. Font face and size are more or less consistent within a given document/page/screen. Apart from synthetic tests, glyphs are reused a lot, and there aren't too many of them.
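
A minimal sketch of the kind of glyph cache this argues for (hypothetical types and names, not Vrmac's API): glyphs are keyed by font, quantized size, and glyph id, rasterized on the CPU once on a cache miss, and then reused as atlas rectangles.

    use std::collections::HashMap;

    /// Key for a cached glyph bitmap. Hinting bakes the pixel size into the
    /// outline, so the size is part of the key.
    #[derive(PartialEq, Eq, Hash, Clone, Copy)]
    struct GlyphKey {
        font_id: u32,
        size_px: u32,
        glyph_id: u32,
    }

    /// Rectangle inside the atlas texture, in texels.
    #[derive(Clone, Copy)]
    struct AtlasRect { x: u16, y: u16, w: u16, h: u16 }

    struct GlyphAtlas {
        cache: HashMap<GlyphKey, AtlasRect>,
    }

    impl GlyphAtlas {
        /// Return atlas coordinates for a glyph, rasterizing it on the CPU
        /// (e.g. with FreeType) only on a cache miss; after that, drawing the
        /// glyph is just another textured quad.
        fn get_or_rasterize(
            &mut self,
            key: GlyphKey,
            rasterize: impl FnOnce() -> AtlasRect,
        ) -> AtlasRect {
            *self.cache.entry(key).or_insert_with(rasterize)
        }
    }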

When I started the project, the very first support for compute shaders on the Pi 4 had just been introduced in the Mesa upstream repo. It was not yet in the official OS images. Bugs are very likely in version 1.0 of anything at all.

Finally, even if the Pi 4 had awesome support for compute shaders back then, the raw compute power of the GPU is not that impressive. On my Windows PC, the GPU is 30 times faster than the CPU in terms of raw FP32 performance. With that kind of performance gap, you can probably make GPU splines work fast enough, after spending enough time on development. Meanwhile, on the Pi 4 there's no such difference: the quad-core CPU has raw performance pretty close to that of the GPU. To a lesser extent, the same applies to low-end PCs: I only have a fast GPU because I'm a graphics programmer; many people are happy with their Intel UHD graphics, which is not necessarily faster than the CPU.

4 years ago by glowcoil

> > This makes it clear you didn't even perform a cursory investigation of the project

> I did, and mentioned in the docs, here’s a quote: ā€œI didn’t want to experiment with GPU-based splines. AFAIK the research is not there just yet.ā€

Not what I said. I said that you didn't investigate the project discussed in the original blog post before declaring, in your words, that "the quality is not good" and comparing it to your own library.

Vrmac and piet-gpu are two totally different types of renderer. Vrmac draws paths by decomposing them into triangles, rendering them with the GPU rasterizer, and antialiasing edges using screen-space derivatives in the fragment shader. This approach works great for large paths, or paths without too much detail per pixel, but it isn't really able to render small text, or paths with a lot of detail per pixel, with the necessary quality. (Given this and the other factors you mentioned in your reply, rendering text on the CPU with Freetype is a perfectly reasonable engineering choice and I am not criticizing it in the slightest.)

In comparison, piet-gpu decomposes paths into line segments, clips them to pixel boundaries, and analytically computes pixel coverage values using the shoelace formula/Green's theorem, all in compute shaders. This is more similar to what Freetype itself does, and it is perfectly capable of rendering high-quality small text on the GPU, in a way that Vrmac isn't without shelling out to Freetype.
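
A simplified CPU sketch of the analytic idea (my illustration in Rust, not piet-gpu's actual kernel): the coverage of a pixel is the signed area of the path clipped to that pixel, and the shoelace formula/Green's theorem turns that area into a sum over the boundary segments.

    /// Signed area of a closed polygon via the shoelace formula (Green's
    /// theorem applied to a piecewise-linear boundary). In an analytic
    /// rasterizer, the same integral is evaluated over a path's segments
    /// clipped to one pixel, giving an exact coverage value for that pixel.
    fn shoelace_area(points: &[(f32, f32)]) -> f32 {
        let n = points.len();
        let mut acc = 0.0;
        for i in 0..n {
            let (x0, y0) = points[i];
            let (x1, y1) = points[(i + 1) % n];
            acc += x0 * y1 - x1 * y0;
        }
        0.5 * acc
    }

    fn main() {
        // Unit square: area 1.0. Clip a path to a pixel and this area
        // is the pixel's coverage.
        let square = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)];
        println!("{}", shoelace_area(&square)); // 1.0
    }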

Again, to be clear, I'm not criticizing any of the design choices that went into Vrmac; it looks like it occupies a sweet spot similar to NanoVG or Dear ImGui, where it can take good advantage of the GPU for performance while still being simple and portable. My only point here is that you performed insufficient investigation of piet-gpu before confidently making an uninformed claim about it and putting it in a somewhat nonsensical comparison with your own project.

4 years ago by skohan

Does your approach allow for quality optimizations like subpixel rendering for arbitrary curves? It seems like this is what is interesting about this approach.

Also in terms of ā€œtwo draw callsā€, does that include drawing text as part of your transparent pass, or are you assuming that your window contents are already rendered to textures?

4 years ago by Const-me

> Does your approach allow for quality optimizations like sub pixel rendering for arbitrary curves?

The library only does grayscale AA for vector graphics.

Subpixel rendering is implemented for text but comes with limitations: it only works when the text is not transformed (or is transformed in specific ways, like rotated 180° or horizontally flipped), and you need to pass the background color behind the text to the API. It will only look good if the text is on a solid-color background, or on a slowly changing gradient.

Subpixel AA is hard for arbitrary backgrounds. I’m not sure many GPUs support the required blend states, and workarounds are slow.

> does that include drawing text as part of your transparent pass

Yes, that includes text and bitmaps. Here are two HLSL shaders that do that; the GPU abstraction library I picked recompiles the HLSL into GLSL or other languages on the fly: https://github.com/Const-me/Vrmac/blob/master/Vrmac/Draw/Sha... https://github.com/Const-me/Vrmac/blob/master/Vrmac/Draw/Sha... These shaders are compiled multiple times with different sets of preprocessor macros, but I did test with all of them enabled at once.

4 years ago by fho

Dumb question ... but isn't the easiest AA method rendering at a higher resolution and downsampling the result?

I see that it's not feasible for a lot of complex 3D graphics, but 2D is (probably) a lot less taxing for modern GPUs?

4 years ago by ishitatsuyuki

piet-gpu contributor here. You're right that supersampling is the easiest way to achieve AA. However, its scalability issues are immense: for 2D rendering it's typically recommended to use 32x for decent-quality AA, but since the cost of supersampling scales linearly (actually superlinearly due to memory/register pressure), it becomes more than an order of magnitude slower than the baseline. So if you want to do anything real-time (e.g. smooth page zoom without resorting to prerendered textures, which become blurry), supersampling is mostly an unacceptable choice.

What is more practical is some form of adaptive supersampling: a lot of pixels are filled by only one path and don't require supersampling. There are also more heuristics that can be used. One that I want to try out in piet-gpu is to exploit the fact that in 2D graphics, most pixels are covered by at most two paths. So as a baseline we can track only two values per pixel plus a coverage mask, then in the rare case of three or more shapes overlapping, fall back to full supersampling. This should keep the cost amplification more under control.
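
A toy sketch of that "two paths plus a mask" bookkeeping (hypothetical types, not piet-gpu code): keep two per-pixel slots of (color, sample mask), and only flag the pixel for the full supersampling fallback when a third path lands on it.

    /// Per-pixel state for a "two paths plus coverage mask" scheme: most
    /// pixels are touched by at most two paths, so we keep two
    /// (color, 16-sample mask) slots and only fall back to full
    /// supersampling when a third path shows up. Hypothetical sketch.
    struct PixelState {
        slots: [(u32 /* rgba */, u16 /* sample mask */); 2],
        used: usize,
        needs_fallback: bool,
    }

    impl PixelState {
        fn add_path(&mut self, color: u32, mask: u16) {
            if self.used < 2 {
                self.slots[self.used] = (color, mask);
                self.used += 1;
            } else {
                // Rare case: three or more overlapping paths on this pixel.
                self.needs_fallback = true;
            }
        }
    }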

4 years ago by Const-me

The short answer — it depends.

What you described is called supersampling. Supersampling is indeed not terribly hard to implement; the problem is performance overhead. Many parts of the graphics pipeline scale linearly with the pixel count. If you render at 16x16 upscaled resolution, that results in 256x more pixel shader invocations, and 256x the fill rate.

There’s a good middle ground called MSAA: https://en.wikipedia.org/wiki/Multisample_anti-aliasing In practice, 16x MSAA often delivers very good results for both 3D and 2D. For 2D, even low-end PC GPUs are fast enough at the 8x or 16x MSAA level.

Initial versions of my library used that method.

The problem was that the Raspberry Pi 4 GPU is way slower than PC GPUs. Performance with 4x or even 2x MSAA was too low at 1920x1080, even just for 2D. Maybe the problem is the actual hardware, maybe it’s a performance bug in the Linux kernel or GPU drivers; I have no idea. I didn’t want to mess with the kernel; I wanted a library that works fast on the officially supported 32-bit Debian Linux. That’s why I bothered to implement my own method for antialiasing.

4 years ago by esperent

> Maybe the problem is actual hardware

I think it is - as far as I know most modern GPUs implement MSAA at the hardware level, and that's why even a mobile GPU can handle 8x MSAA at 1080p.

I don't know anything about the Raspberry Pi GPU, but maybe you'd have better results switching to FXAA or SMAA there (by which I mean faster, not visually better).

4 years ago by dahart

High quality supersampling is about equally hard in 2D and 3D, since the result is 2D. It is also the most common solution in both 2D and 3D graphics, so your instinct is reasonably good.

But font and path rendering, for example in web browsers and/or with PDF or SVG - these things can benefit in both efficiency and quality from using analytic antialiasing methods. 2D vector rendering is a place where doing something harder has real payoffs.

Just a fun aside - not all supersampling algorithms are equally good. If you use the wrong filter, it can be very surprising to discover that there are ways you can take a million samples per pixel or more and never succeed in getting rid of aliasing artifacts. (An example is if you just average the samples, aka use a Box Filter.) I have a 2D digital art project to render mathematical functions that can have arbitrarily high frequencies. I spent money making large format prints of them, so I care a lot about getting rid of aliasing problems. I've ended up with a Gaussian filter, which is a tad blurrier than experts tend to like, because everything else ends up giving me visible aliasing somewhere.
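
To illustrate the difference (my sketch, not the code behind those prints): a box filter gives every sample in the pixel equal weight, while a Gaussian filter weights samples by distance from the pixel center, trading a little blur for much better suppression of the frequencies a plain average lets alias through.

    /// Resolve one pixel from supersamples. `samples` holds (dx, dy, value),
    /// with (dx, dy) the sample's offset from the pixel center in pixels.
    fn box_filter(samples: &[(f32, f32, f32)]) -> f32 {
        samples.iter().map(|&(_, _, v)| v).sum::<f32>() / samples.len() as f32
    }

    /// Gaussian-weighted resolve with standard deviation `sigma` (in pixels).
    /// Slightly blurrier than the box filter, but it rolls off high
    /// frequencies instead of folding them back in as aliasing.
    fn gaussian_filter(samples: &[(f32, f32, f32)], sigma: f32) -> f32 {
        let mut num = 0.0;
        let mut den = 0.0;
        for &(dx, dy, v) in samples {
            let w = (-(dx * dx + dy * dy) / (2.0 * sigma * sigma)).exp();
            num += w * v;
            den += w;
        }
        num / den
    }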

4 years ago by ghusbands

Aside to the aside - you might find that applying a sharpening filter after the Gaussian gives a good result, as it won't reintroduce every kind of aliasing, and can reduce the appearance of blurriness.

4 years ago by davrosthedalek

If you render two anti-aliased boxes next to each other (i.e. they share an edge), will you see a thin line there, or a solid fill? Last time I checked, cairo-based PDF readers get this wrong, for example.

4 years ago by Const-me

Good question.

If you use the fillRectangle method https://github.com/Const-me/Vrmac/blob/1.2/Vrmac/Draw/iDrawC... to draw them, I think you should get a solid fill. That particular API doesn’t use AA for that shape. Modern GPU hardware, with its rasterization rules https://docs.microsoft.com/en-us/windows/win32/direct3d11/d3..., is good at that use case; otherwise 3D meshes would contain holes between triangles.

If you render them as 2 distinct paths, filled, not stroked, and anti-aliased, you will indeed see a thin line between them. Currently, my AA method shrinks filled paths by about 0.5 pixels. For stroked paths it’s the opposite, BTW: the output is inflated by half of the stroke width (the midpoints of the strokes correspond to the source geometry).

You can merge the boxes into a single path with 2 figures https://github.com/Const-me/Vrmac/blob/1.2/Vrmac/Draw/Path/P...; in this case the C++ code of the library should collapse the redundant inner edge and the output will be identical to a single box, i.e. a solid fill. It will also render slightly faster because there are fewer triangles in the mesh.

4 years ago by moonchild

Something else worth looking at: the slug font renderer[0]. Sadly it's patented, but the paper[1] is there for those of you in the EU.

0. http://sluglibrary.com/

1. http://jcgt.org/published/0006/02/02/paper.pdf

4 years ago by Lichtso

In 2005 Loop & Blinn [0] found a method to decide if a sample / pixel is inside or outside a bezier curve (independently of other samples, thus possible in a fragment shader) using only a few multiplications and one subtraction per sample.

    - Integral quadratic curve: One multiplication
    - Rational quadratic curve: Two multiplications
    - Integral cubic curve: Three multiplications
    - Rational cubic curve: Four multiplications

[0] https://www.microsoft.com/en-us/research/wp-content/uploads/...
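
A minimal sketch of the integral quadratic case (CPU-side Rust rather than a fragment shader, for illustration): the covering triangle's vertices get the canonical coordinates (0, 0), (1/2, 0), (1, 1); the GPU interpolates (u, v), and each sample evaluates u² - v, the one multiplication and one subtraction mentioned above, with the sign telling which side of the curve the sample is on.

    /// Loop-Blinn style inside/outside test for an integral quadratic BƩzier.
    /// The curve's control points are assigned the canonical coordinates
    /// (0, 0), (1/2, 0), (1, 1); the interpolated (u, v) is evaluated per
    /// sample. CPU sketch for illustration, not an actual fragment shader.
    fn quadratic_implicit(u: f32, v: f32) -> f32 {
        u * u - v // zero on the curve, opposite signs on the two sides
    }

    fn main() {
        // On the curve itself u^2 == v, e.g. the midpoint (u, v) = (0.5, 0.25).
        assert_eq!(quadratic_implicit(0.5, 0.25), 0.0);
        // Points with v > u^2 lie on the other side of the curve.
        assert!(quadratic_implicit(0.5, 0.5) < 0.0);
    }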

4 years ago by vg_head

It's referenced in Slug's algorithm description paper [1]. The main disadvantage of Loop-Blinn is the triangulation step that is required, and at small text sizes you lose a bit of performance. Slug only needs to render a quad for each glyph. That is not to say that any one method is better than the other, though! They both have advantages and disadvantages. I think the two most advanced techniques for rendering vector graphics on the GPU are "Massively Parallel Vector Graphics" [2] and "Efficient GPU Path Rendering Using Scanline Rasterization" [3], though I don't know of any well-known usage of them. Maybe it's because they are very hard to implement; the sources attached to them are not trivial to understand, even if you've read the papers. They also use OpenCL/CUDA if I remember correctly.

[1] "GPU-Centered Font Rendering Directly from Glyph Outlines" http://jcgt.org/published/0006/02/02/

[2] http://w3.impa.br/~diego/projects/GanEtAl14/

[3] http://kunzhou.net/zjugaps/pathrendering/

EDIT: I've only now seen that [2] and [3] are already mentioned in the article

EDIT2: To compensate for my ignorance, I will add that one of the authors of MPVG has a course on rendering vector graphics: http://w3.impa.br/~diego/teaching/vg/

4 years ago by Lichtso

If I understand correctly, the second link is basically an extension of Loop-Blinn's implicit curve approach with vector textures, in order to find the winding count for each fragment in one pass.

>> Slug only needs to render a quad for each glyph.

I don't know how many glyphs you want to render (to the point that there are so many that you can't read them anymore), but modern GPUs are heavily optimized for triangle throughput. So 2 or 20 triangles per glyph makes only a little difference. The bigger problem is usually the sample fill rate and memory bandwidth (especially if you have to write to pixels more than once).

I have been eyeing the scanline-intersection-sort approach (your third link) too. Sadly they have no answer to path stroking (same as everybody else), and it also requires an efficient sorting algorithm for the GPU (implementations of which are hard to come by outside of CUDA, as you mentioned).

4 years ago by korijn

Any alternative solutions for the problem of GPU text rendering (that are not patent infringing)?

4 years ago by djmips

A signed distance field approach can be good depending on what you're after. https://github.com/libgdx/libgdx/wiki/Distance-field-fonts

4 years ago by onion2k

There's a great WebGL library for doing that on the web using any .ttf, .otf, or .woff font - https://github.com/protectwise/troika/tree/master/packages/t...

4 years ago by pvidler

You can always render the text to a texture offline as a signed distance field and just draw out quads as needed at render time. This will always be faster than drawing from the curves, and rendering from an SDF (especially multi-channel variants) scales surprisingly well if you choose the texture/glyph size well.

A little more info:

https://blog.mapbox.com/drawing-text-with-signed-distance-fi...

MIT-licensed open-source multi-channel glyph generation:

https://github.com/Chlumsky/msdfgen

The only remaining issue would be the kerning/layout, which is admittedly far from simple.
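
A sketch of the render-time half of the SDF approach (my illustration, not any particular library's shader): sample the distance field and map it to coverage with a ramp whose width matches the current on-screen pixel size, which is what lets one pre-baked texture hold up across a range of sizes.

    /// Turn a sampled signed-distance value into pixel coverage. The SDF is
    /// baked so that 0.5 lies exactly on the glyph outline; `px_range` is how
    /// wide one screen pixel is in distance-field units at the current scale
    /// (in a shader this would come from fwidth()). Illustration only.
    fn sdf_coverage(distance_sample: f32, px_range: f32) -> f32 {
        let half = 0.5 * px_range;
        ((distance_sample - 0.5 + half) / px_range).clamp(0.0, 1.0)
    }

    fn main() {
        // Exactly on the outline: half coverage. Well inside: full coverage.
        println!("{}", sdf_coverage(0.5, 0.1)); // 0.5
        println!("{}", sdf_coverage(0.9, 0.1)); // 1.0
    }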

4 years ago by srgpqt

FOSS does not magically circumvent patents.

4 years ago by orhmeh09

Is there a serious risk of patent enforcement in common open source repositories ranging from GitHub to PPAs and Linux package repositories located outside any relevant jurisdictions?

4 years ago by korijn

Does that imply it's possible to implement 2D font/vector graphics rendering on a GPU and end up getting burned by patent law? I am having a hard time imagining they were awarded such a generic patent.

Anyway, I will adjust my question based on your feedback.

4 years ago by Jasper_

Slug isn't great for lots of shapes since it does the winding order scanning per-pixel on the pixel shader. It does have a novel quadratic root-finder. Put simply, it's better suited to fonts than large vector graphics.

4 years ago by vg_head

I once implemented the basic idea behind the algorithm used in Slug (described in the paper [1], though without the 'band' optimization; I just wanted to see how it works), and I agree with you: the real innovation is in that quadratic root-finder. It can tell you whether you are inside or outside just by manipulating the three control points of a curve, and it's very fast; what remains is to use an acceleration data structure so that you don't have to check every curve. That works very well for quadratic BĆ©zier curves. The paper says it can be easily extended to cubics, though no example is provided (and I doubt it's trivial). What I think would be hard with Slug's method is extending it to draw gradients, shadows, basically general vector graphics like you say. Eric Lengyel showed a demo on his Twitter [2] using Slug to render general vector graphics; I'm not sure how many features it supports, but it definitely supports cubic BĆ©zier curves. I'd also add that the algorithm didn't impress me with how the text looks at small sizes, which I think is very important in general, though maybe not so much for games (maybe I just didn't implement it correctly).

[1] "GPU-Centered Font Rendering Directly from Glyph Outlines" http://jcgt.org/published/0006/02/02/

[2] https://twitter.com/EricLengyel/status/1190045334791057408

4 years ago by ink_13

This would have been a lot better with examples in the form of rendered images or perhaps even a video. Maybe it's just my lack of background in graphics, but I had a lot of trouble grasping what the author was attempting to communicate without a concrete example.

4 years ago by eternalban

It's about taking known 2D graphics and UI approaches which were developed for CPUs and looking at effective rendering engine architectures that do the same using GPUs. Terms such as "scene graph", "retained mode UI", etc. come from that existing 2D graphics subject matter.

So the approach, afaiu, is a data layout for the scene graph that is basically the more domain-general concern of mapping graph data structures (e.g. linked lists, which are CPU friendly) to array forms (GPU friendly) suitable for parallel treatment. There are other GPU concerns as well, such as minimizing global traffic via local caching, and mapping thread groups to tiles. I found the idea of having the scene graph resident on the GPU to be interesting.

(note to author: "serialization" comes from the networking roots of serializing a data structure for transmission over the net. So, definitely serial. /g)

4 years ago by Agentlien

I understand where you are coming from. There is a lot of jargon and it assumes familiarity with many concepts. I think any explanatory images which would help someone unfamiliar with the field would need to be accompanied by quite a bit of explanation.

One thing which I think made reading this a bit more work than necessary is that it feels like it's prattling on about a lot of tangential details and never quite gets to the point.

edit: a prime example of an unnecessary aside is mentioning the warp/wavefront/subgroup terminology. I feel anyone in the target audience should know this already and it's not really relevant to what's being explained.

4 years ago by skohan

It doesn't seem to be a finished work. I guess this is more of a journal entry on the author's initial takeaways from a week-long coding retreat.

4 years ago by MattRix

It wouldn’t have been better, it just would have been more high level and generalized, but I don’t think that’s what the author was going for. I found the amount of detail refreshing, and as someone about to make a GPU based 2D display tree renderer, it was written at just the right level to be quite useful.

4 years ago by danybittel

It would be fantastic if something like this were part of the modern APIs: Vulkan, Metal, DX12. But I guess it's not as sexy as raytracing.

4 years ago by raphlinus

I think the world is going into a more layered approach, where it's the job of the driver API (Vulkan etc) to expose the power of the hardware fairly directly, and it's the job of higher levels to express the actual rendering in terms of those lower level primitives. Raytracing is an exception because you do need hardware support to do it well.

Whether there's hardware that can make 2D rendering work better is an intriguing question. The NV path rendering stuff (mentioned elsethread) was an attempt (though I think it may be more driver/API than hardware), but I believe my research direction is better, in that it will be higher quality, faster, and more flexible with respect to the imaging model on standard compute shaders than an approach using the NV path rendering primitives. Obviously I have to back that up with empirical measurements, which is not yet done :)

4 years ago by moonchild

Nvidia tried to make it happen[0].

Sadly, it didn't catch on.

0. https://developer.nvidia.com/nv-path-rendering

4 years ago by slimsag

It's still there on all NVIDIA GPUs as an extension, just nobody uses it.

IMO it didn't catch on because all three of these points:

1. It only works on NVIDIA GPUs, and is riddled with joint patents from NVIDIA and Microsoft forbidding anyone like AMD or Intel from supporting it.

2. It's hard to use: you need to get your data into a format it can consume, usage is non-trivial, and often video game artists are already working with rasterized textures anyway so it's easy to omit.

3. Vector graphics suck for artists. The number of graphic designers I have met (who are the most likely subset of artists to work with vector graphics) that simply hate or do not understand the cubic bezier curve control points in Adobe, Inkscape, and other tools is incredible.

4 years ago by derefr

> It only works on NVIDIA GPUs, and is riddled with joint patents from NVIDIA and Microsoft forbidding anyone like AMD or Intel from supporting it.

Why do companies do this? What do they expect to get out of creating an API that is proprietary to their specific non-monopoly device, and that therefore very obviously nobody will ever actually use?

4 years ago by redisman

> Vector graphics suck for artists

Someone mentioned Flash in this thread and that was a very approachable vector graphics tool. I don't know how many games translate to the vector style though - it's almost invariably a cartoonish look. The tools are very geometric so they just kind of nudge you towards that. Pixels these days are more like painting, so it's no surprise artists like that workflow (they all secretly want to be painting painters).

4 years ago by Const-me

> just nobody uses it

AFAIK Skia and derived works (Chrome, Chromium, Electron, etc.) all use it when available.

4 years ago by Jasper_

Stencil-and-cover approaches like NV_path_rendering have a lot of drawbacks, as documented below, but probably biggest of all, they're still mostly doing the tessellation on the CPU. A lot of the important things, like winding mode calculations, are handled on the CPU. Modern research is looking for ways out of that.

4 years ago by Lichtso

Actually, they calculate the winding per fragment on the GPU [0]. They require polygon tessellation only for stroking (which has no winding rule). The downside of their approach is that it is memory bandwidth limited, precisely because it does the winding on the GPU instead of using CPU tessellation to avoid overlap / overdraw.

Curve filling is pretty much solved with implicit curves, stencil-and-cover or scanline-intersection-sort approaches (all of which can be done in a GPU only fashion). Stroking is where things could still improve a lot as it is almost always approximated by polygons.

[0]: https://developer.nvidia.com/gpu-accelerated-path-rendering

4 years ago by klaussilveira

Loosely related: Blend2D has been innovating a lot in this space.

https://blend2d.com/

4 years ago by slmjkdbtl

Always appreciate raph's work on rendering and UI programming, but I want to ask a question somewhat unrelated to this post: does anyone have a lot of experience doing 2D graphics on the CPU? I wonder if there'll be a day we're confident doing all 2D stuff on the CPU, since CPUs are much easier to work with and give much more control. I also read that some old 3D games used software rendering and did well on old hardware; that gave me a lot of confidence in software rendering every (lowrez) thing.

4 years ago by Jasper_

Yes? We know how to write scanline renderers for 2D graphics. They're not that hard; a simple one can be done in ~100 lines of code or so. See my article here: https://magcius.github.io/xplain/article/rast1.html
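
For a sense of what that looks like, here's a compressed sketch of the core loop (a toy even-odd fill in Rust, not the code from the article): intersect each scanline's center with the polygon's edges, sort the crossings, and fill between alternating pairs.

    /// Minimal scanline fill of a polygon into a 1-bit-per-pixel buffer using
    /// the even-odd rule, sampling each row at its vertical center. A toy
    /// sketch without AA or edge-list optimizations.
    fn fill_polygon(points: &[(f32, f32)], width: usize, height: usize) -> Vec<bool> {
        let mut buf = vec![false; width * height];
        for y in 0..height {
            let sample_y = y as f32 + 0.5;
            // Collect x coordinates where polygon edges cross this scanline.
            let mut xs = Vec::new();
            for i in 0..points.len() {
                let (x0, y0) = points[i];
                let (x1, y1) = points[(i + 1) % points.len()];
                if (y0 <= sample_y) != (y1 <= sample_y) {
                    let t = (sample_y - y0) / (y1 - y0);
                    xs.push(x0 + t * (x1 - x0));
                }
            }
            xs.sort_by(|a, b| a.partial_cmp(b).unwrap());
            // Fill pixels whose centers fall between alternating crossings.
            for pair in xs.chunks(2) {
                if let &[xa, xb] = pair {
                    let start = ((xa - 0.5).ceil().max(0.0) as usize).min(width);
                    let end = ((xb - 0.5).ceil().max(0.0) as usize).min(width);
                    for x in start..end {
                        buf[y * width + x] = true;
                    }
                }
            }
        }
        buf
    }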

4 years ago by slmjkdbtl

Thanks for the amazing article! I wonder if you ran into any performance annoyances / bottlenecks when doing actual GUI / game dev with this?

4 years ago by iainmerrick

You just need to render your font on the CPU once, and upload mipmaps to the GPU (either signed distance fields, or just sharpened textures; that works absolutely fine too).

I think all this GPU font rendering stuff is a bit of a red herring. Are any popular apps or games actually making heavy use of it?

4 years ago by olau

Enjoyable article, thanks!

4 years ago by bumbada

I have lots of experience; I've created several font renderers on the CPU and GPU.

No, doing CPU drawing is too inefficient. Without too much trouble you get 50x more efficiency on the GPU; that is, you can draw 50 frames for every frame on the CPU, using the same amount of energy and time.

If you go deeper, low level with Vulkan/Metal, and especially if you control the GPU memory, you can get 200x, though it is way harder to program.

CPU drawing is very useful for testing: you create the reference you compare the GPU output against.

CPU drawing is the past, not the future.

4 years ago by slmjkdbtl

Thanks for the write-up. Yeah, I can see the huge performance difference. One thing that bothers me about the GPU is that now every vendor provides a different API; you kinda have to use a hardware abstraction layer if you really want cross-platform, and that's often a huge effort or dependency and hard to do right. Even in the OpenGL days it was easier because you only had to deal with one API instead of three (Vulkan/Metal/D3D). With the CPU, if you ignore the lack of performance, it's just a plain pixel array that can be easily displayed in any environment and you have control over every bit of it. I just can't get over the lightness and elegance difference between the two.

4 years ago by bumbada

Vulkan/metal/d3d give you control to do things you could not do with OpenGL, and they are very similar. All my code works first in Vulkan and Metal, but d3d is not hard when you have your vulkan code working.

OpenGL was designed by committee and politics (like not giving you the option to precompile shaders for a long time, while D3D could do it).

The hard part is thinking in terms of extreme parallelism. That is super hard.

Once you have something working on the GPU, you can translate it to electronics, e.g. using FPGAs.

The difference in efficiency is so big that most GPU approaches do not really work. They are really approximations that fail with certain glyphs. Designers create a model that works with most glyphs, and sometimes they have a fallback, less efficient method for the rest.

With the CPU you can calculate the area under a pixel exactly.

4 years ago by fulafel

The future is now: on many systems there's fairly little GPU acceleration going on when running e.g. a web browser, and things work fine.

4 years ago by choxi

I spent a couple of years learning graphics programming to build an iPad app for creating subtle animated effects. The idea was kind of like Procreate, but with "animated brushes" that produced glimmer or other types of looping animated effects.

From what I've read, the technique behind most digital brushes is to render overlapping "stamps" over the stroke. They're spaced closely enough that you can't actually see the stamps.
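
A sketch of that spacing logic (a hypothetical function, not any particular app's code): walk the stroke polyline and emit a stamp every fixed distance, carrying leftover distance across segments so the overlap stays even.

    /// Place brush "stamps" along a polyline stroke at a fixed spacing,
    /// carrying over leftover distance between segments so the spacing stays
    /// even across corners. Sketch for illustration only.
    fn place_stamps(stroke: &[(f32, f32)], spacing: f32) -> Vec<(f32, f32)> {
        let mut stamps = Vec::new();
        let mut carry = 0.0; // distance already "used up" before the next stamp
        for seg in stroke.windows(2) {
            let (x0, y0) = seg[0];
            let (x1, y1) = seg[1];
            let len = ((x1 - x0).powi(2) + (y1 - y0).powi(2)).sqrt();
            let mut d = carry;
            while d < len {
                let t = d / len;
                stamps.push((x0 + t * (x1 - x0), y0 + t * (y1 - y0)));
                d += spacing;
            }
            carry = d - len;
        }
        stamps
    }

    fn main() {
        // A brush stamped every 2 px along a 100 px line already produces
        // ~50 overlapping stamp instances for a single stroke segment.
        let stroke = [(0.0, 0.0), (100.0, 0.0)];
        println!("{} stamps", place_stamps(&stroke, 2.0).len()); // 50 stamps
    }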

But if you want to animate the stamps, you either have to store the stroke data as a very large sequence of stamp meshes, or you can only work with the data in a raster format. The former is way too many meshes even with instancing, and the latter loses a lot of useful information about the stroke. Imagine you wanted to create a brush where the edges of the stroke kind of pulsate like a laser beam; you ideally want to store that stroke data in a vector format to make it easier to identify e.g. centers and edges.

But it turned out to be too challenging for me to figure out how to 1) build a vector representation of a stroke/path without losing some of the control over detail you get with the stamping technique and 2) efficiently render those vectors on the GPU.

I'm not sure if this would help with the issues I ran into, but I'm definitely excited to see some focus on 2D rendering improvements!
