2D dragging as an interface for 3D rotation

So when I implemented my primitive 3D “renderer”, something I had to think about was how to implement drag rotation. Prior to writing the code, and prior to even thinking about using 3D models for visualising eigenvectors, I had pondered about how drag rotation should work.

In three dimensions, if you specify two unit vectors (normalise them if they’re not), there are infinitely many rotations taking the first vector to the second, because you can take any such rotation and left-compose it with a rotation around the second vector to get another one. In general, however, there is a unique shortest rotation, which follows a great circle path. (Uniqueness fails exactly when the two vectors are antipodal.) The axis of this rotation is the cross product of the vectors, and the angle is the arccosine of their dot product. The resulting rotation can be represented in multiple ways, such as axis-angle, Euler angles, a 3×3 matrix, a quaternion, etc.

In the quaternion case, the above operation of generating a rotation quaternion from two vectors can be done without using any trigonometric functions at all, which is an optimisation that’s apparently used a lot in video games and such. I won’t go into detail about this.
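The gist, for the curious, looks something like this (a Python sketch, with quaternions as plain (w, x, y, z) tuples; the only root-taking is the final renormalisation):

```python
import math

def quat_from_unit_vectors(u, v):
    """Rotation quaternion (w, x, y, z) taking unit vector u to unit vector v
    along the great circle, with no trig calls: the unnormalised quaternion
    (1 + u.v, u x v) already encodes the half angle, so a single square root
    (the renormalisation) does all the work.  Breaks down when u = -v."""
    w = 1.0 + u[0]*v[0] + u[1]*v[1] + u[2]*v[2]
    x = u[1]*v[2] - u[2]*v[1]
    y = u[2]*v[0] - u[0]*v[2]
    z = u[0]*v[1] - u[1]*v[0]
    n = math.sqrt(w*w + x*x + y*y + z*z)
    return (w/n, x/n, y/n, z/n)

def rotate(q, p):
    """Apply rotation quaternion q to the point p, via q (0, p) q*."""
    w, x, y, z = q
    px, py, pz = p
    # t = q * (0, p)
    tw = -x*px - y*py - z*pz
    tx =  w*px + y*pz - z*py
    ty =  w*py + z*px - x*pz
    tz =  w*pz + x*py - y*px
    # vector part of t * conj(q)
    return (-tw*x + tx*w - ty*z + tz*y,
            -tw*y + ty*w - tz*x + tx*z,
            -tw*z + tz*w - tx*y + ty*x)
```

The trick is that (1 + u·v, u×v) has magnitude 2cos(θ/2), so dividing by it leaves exactly (cos(θ/2), sin(θ/2)·axis).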

The very first idea I had for 2D dragging was something like this. Say you have some kind of surface z=f(x,y) lying above whatever you’re rendering; then a drag from (x_0,y_0) to (x_1,y_1) would be treated as the shortest rotation from (x_0,y_0,f(x_0,y_0)) to (x_1,y_1,f(x_1,y_1)) after appropriate normalisation. I was being intellectually lazy about what kind of surface to use when making my visualisation demo so I just picked a flat surface f:(x,y)\mapsto1, which seems to work reasonably well/intuitively with my face-transitive models.

Or is the “intuitiveness” a side effect of conditioning because everyone else also uses the same idea to convert 2D drags to 3D rotations, or is it just that the surface used doesn’t really matter all that much? Who knows.

One peculiar result of using the above dragging convention (at least with a flat surface) is that dragging in a circle will rotate the model in the same direction as the dragging. This might or might not be useful, but it’s interesting to note nonetheless.

After this was implemented in my renderer, I added an alternative convention where the rotation was location-independent; a drag from (x,y) to (x+\Delta x,y+\Delta y) would produce the same rotation as a drag from (0,0) to (\Delta x,\Delta y), or, more symmetrically, from (-\Delta x/2,-\Delta y/2) to (\Delta x/2,\Delta y/2). The rotation is then generated by the two vectors (-\Delta x/2,-\Delta y/2,z) and (\Delta x/2,\Delta y/2,z), where z is some constant. (Note that since these two vectors are images of each other under a half-turn about the z-axis, their cross product must be perpendicular to the z-axis and hence lie in the xy-plane. This applies to the rotation axis too, since it’s just a scalar multiple of the cross product.)

Unlike the first convention, you can effect an arbitrarily large rotation in a single drag with this convention simply by dragging in one direction far enough; with the first convention, dragging along a straight line passing through the origin results in a net rotation by only π.

This drag→rotation convention also leads to the even stranger result that dragging in a circle will make the model rotate in the opposite direction. Or, I guess it’s not all that strange if you try to think about what dragging in a circle is really doing.

The above seems to suggest that we should be able to interpolate between the two presented conventions so that dragging in a circle leads to zero net rotation. Given the two 2D vectors (x_0,y_0) and (x_1,y_1), define the following vectors, where z is the height of our surface to project the cursor onto.

\displaystyle \mathbf m:=(\tfrac{x_0+x_1}2,\tfrac{y_0+y_1}2)\\  \mathbf d:=(\tfrac{x_1-x_0}2,\tfrac{y_1-y_0}2)

Further define \mathrm{quat}:\mathbb R^3\times\mathbb R^3\to\mathbb H as a function producing a quaternion out of two vectors, normalising them as necessary. The result is undefined if the vectors are degenerate (one of them is zero, they’re parallel/antiparallel, etc.). As an abuse of notation, we’ll use this function on a pair of 2D vectors, in which case you should assume the z-coordinate is 1.

The two conventions are then \mathrm{quat}(\mathbf m-\mathbf d,\mathbf m+\mathbf d) and \mathrm{quat}(-\mathbf d,\mathbf d), from which we see that the obvious way to interpolate this is to use \mathrm{quat}(c\mathbf m-\mathbf d,c\mathbf m+\mathbf d) for some value of c, and indeed, using c=1/2 seems to result in zero net rotation when dragging in circles, or, in other words, the resulting rotation depends only on where you start/stop the drag and is path-independent.
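To see the cancellation numerically, here’s a throwaway Python experiment (ad hoc function names) that composes the per-step rotations \mathrm{quat}(c\mathbf m-\mathbf d,c\mathbf m+\mathbf d), with z-coordinate 1 as above, around one full circular drag:

```python
import math

def normalize(v):
    n = math.sqrt(sum(comp*comp for comp in v))
    return tuple(comp/n for comp in v)

def quat(u, v):
    # rotation quaternion from u to v (inputs normalised first)
    u, v = normalize(u), normalize(v)
    w = 1.0 + u[0]*v[0] + u[1]*v[1] + u[2]*v[2]
    x = u[1]*v[2] - u[2]*v[1]
    y = u[2]*v[0] - u[0]*v[2]
    z = u[0]*v[1] - u[1]*v[0]
    n = math.sqrt(w*w + x*x + y*y + z*z)
    return (w/n, x/n, y/n, z/n)

def qmul(a, b):
    aw, ax, ay, az = a
    bw, bx, by, bz = b
    return (aw*bw - ax*bx - ay*by - az*bz,
            aw*bx + ax*bw + ay*bz - az*by,
            aw*by + ay*bw + az*bx - ax*bz,
            aw*bz + az*bw + ax*by - ay*bx)

def net_rotation(c, steps=2000, r=0.3):
    """Drag once around a circle of radius r in small steps, composing the
    per-step rotations quat(c*m - d, c*m + d) (z-coordinate fixed at 1);
    return the net rotation angle."""
    q = (1.0, 0.0, 0.0, 0.0)
    pts = [(r*math.cos(2*math.pi*k/steps), r*math.sin(2*math.pi*k/steps))
           for k in range(steps + 1)]
    for (x0, y0), (x1, y1) in zip(pts, pts[1:]):
        mx, my = (x0 + x1)/2, (y0 + y1)/2
        dx, dy = (x1 - x0)/2, (y1 - y0)/2
        q = qmul(quat((c*mx - dx, c*my - dy, 1.0),
                      (c*mx + dx, c*my + dy, 1.0)), q)
    return 2*math.acos(min(1.0, abs(q[0])))
```

With c=1/2 the net rotation angle comes out far smaller than with either c=0 or c=1, consistent with the above.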

How would one go about proving that this is really path-independent? Quaternion multiplication is not commutative, so we can’t just convert an infinite composition of infinitesimal rotations into an integral of something by taking the logarithm of the associated quaternions.

BTW sandpile, part deux

Because I have way too much free time[citation needed], I spent a couple of hours on Wednesday afternoon optimising my BTW sandpile code to exploit the D4 symmetry inherent in starting with a single “stack” of sand.

All I have to say is that dealing with edge cases is annoying and contributed to the bulk of those hours. It was necessary to duplicate (ugh) the code to handle edge cases separately instead of within the main loop because branch prediction seems to suck in JS. This improved the speed by about 30%.

It runs about four times as fast as the old code; less than the eightfold speedup I expected, but still quite an improvement nonetheless.
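For reference, the core algorithm being optimised is just repeated toppling; a minimal (and thoroughly unoptimised) Python sketch, ignoring the D4 trick entirely:

```python
from collections import deque

def sandpile(n):
    """Drop n grains at the origin of an (implicitly infinite) square lattice
    and topple until every site holds at most 3 grains.  Toppling is abelian,
    so the order of processing doesn't affect the final configuration."""
    grid = {(0, 0): n}
    todo = deque([(0, 0)])
    while todo:
        p = todo.popleft()
        h = grid.get(p, 0)
        if h < 4:
            continue
        q, rem = divmod(h, 4)       # mass toppling: fire q times at once
        grid[p] = rem
        x, y = p
        for nb in ((x+1, y), (x-1, y), (x, y+1), (x, y-1)):
            grid[nb] = grid.get(nb, 0) + q
            if grid[nb] >= 4:
                todo.append(nb)
    return grid
```

The stable configuration is D4-symmetric, which is exactly the redundancy the optimisation exploits.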

[BTW sandpile, 800000 grains]

Here, have an 800000-grain run that took 199.3 seconds. The corresponding 400000-grain run took 50.5 seconds, which fits my heuristic guess of \Theta(S^2) complexity a bit too well. By extrapolation, a 28-million-grain run would take about 1225 times as long, i.e. about 68 hours. Multithreading and SIMD ought to be able to bring this down by a factor of maybe 5, though neither of those are trivial to implement.

Update: here’s a 7-million-grain run, in greyscale and with the R/G/B palette. Note the qualitative similarity with the one on Wikipedia; maybe the sandpile will tend to some fractal-y structure? Run-time logs have also been uploaded.

Update 2: arXiv:1105.0111 [math.AP]



I’ve been running a web server on my Linux-running laptop for years now. (Not as in the uptime, but for how long ago I set it up.) As I’ve mentioned before, this laptop’s gone through multiple Linux installations, most of which I installed nginx on, and since nginx configs are somewhat forwards-compatible, I didn’t have much trouble just copying the config from one install to another. The web server has only ever served static content, which probably helped with that.

Anyway, I had it set up so that roughly the same configs were listening to ports 80 (HTTP) and 443 (HTTPS), but the HTTP server required password authentication. There wasn’t really any particular reason for doing this that I can recall; maybe it was so that only I could connect without encryption. (Back then I had my whole hard disk autoindexed, which is as terrible an idea as it sounds. It’s slightly less dumb with encryption and authentication though…) That said, it’s occasionally amusing to look at my nginx access logs and see people trying to hit port 80 only to get blocked by auth. Also, broken websites that pointed shit to localhost (mostly due to borked DNS) produced HTTP authentication dialogue boxes.

The funny thing is, the last time I reinstalled Linux was 2012, which also means the last time I copied the config from another install was that long ago too. I only just found out I somehow never copied the .htpasswd over and authentication was entirely impossible. I guess it didn’t really make a difference since I never gave the password out and I hardly used it myself.


Man, having to take English modules even in university sucks big-time.

I knew this was coming, and I still can’t take it. It’s really unfortunate that I just so happen to have a Wednesday night cartoon to help fansub, and the English classes are on Monday and Thursday mornings.

Writing is something I’m not efficient at, and my habit of doing things at the last minute hasn’t gone away, so now I’m pondering over the various amounts of doom I’ll experience tomorrow if I don’t manage to get shit done within about 1.5 hours. (See, that’s how long it took to write this blog post.)

Holy shit, I am doomed.

Last semester I only really got into trouble with the last PC1141 lab summary, which I rushed out the morning of the day it was due. (I somehow got an A for that, but I think that was just sympathy marks, lol. I put considerably more effort into the earlier lab summaries and only got B/C for those.) Homework for the other modules I was taking wasn’t too hard: Multivariable Calculus was easy, Fundamental Concepts of Mathematics was challenging but doable, Physics I was also challenging but I got help. I guess it worked in my favour that I took only three modules that required doing work outside of the lectures last semester.

But this semester I’m taking six modules, five of which have homework of some sort. And Mathematical Analysis I has weekly assignments that take over six hours to finish. This is absolutely ridiculous, I tell you. (Hello, 5 MC should be a weekly workload of 12.5 hours, not— actually I guess five hours of lectures/tutorials and six hours of homework doing makes only 11 hours. But still!)

It’s good practice, sure, which actually reminds me of the fact that while I have a fairly solid understanding of the “easy” maths, I haven’t really explored much of higher mathematics, so the laidback attitude I’ve had throughout high school and last semester will probably betray me at some point. (In fact, it already did: I messed up the Multivariable Calculus exam. But that’s a story for another day. I’d also say I botched the Physics I exam but seeing how I got an A overall, other people must’ve done even worse. Thanks, bell curve.)


Given any polynomial P, P(P(x))-x is always divisible by P(x)-x.

Proof: Factor P(x)-x into its zeroes. These values are the fixed points of P, and hence also of P\circ P, so they are also zeroes of P(P(x))-x.

So, uh, I realise five years later that this actually doesn’t really hold water. It works when there are no repeated roots, of course, but what if there are? Then this just means that P(P(x))-x has the same roots as P(x)-x, not that the former has root multiplicity at least as large as the latter (which is what we need to prove).

The proof is easy to mend anyway. Suppose x_0 is some root of P(x)-x with multiplicity m (where m\ge1). We can express P(x)-x as a polynomial in x-x_0; its lowest-degree nonzero term is then a_m(x-x_0)^m. To simplify the equations a bit, let y:=x-x_0. Evaluating P(P(x))-x then gives

\displaystyle \begin{array}{lcl}P(P(x))-x&=&[P(P(x))-P(x)]+[P(x)-x]\\  &=&a_m(y+a_my^m+O(y^{m+1}))^m+O(y^{m+1})+a_my^m+O(y^{m+1})\\  &=&a_m(y^m+ma_my^{2m-1}+O(y^{2m}))+a_my^m+O(y^{m+1})\\  &=&2a_my^m+O(y^{\min\{m+1,2m-1\}})  \end{array}

Therefore P(P(x))-x has x_0 as a root with multiplicity at least m. In particular, if m\ge2, then the multiplicity is exactly m.

Easy mode: prove that (P(x)-x)|(P^n(x)-x) for all nonnegative integers n, where the exponent refers to function iteration.

Hard mode: do that without induction. (The above argument adapts very straightforwardly to the general case, but also requires induction on n.)
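For the sceptical, a quick exact-arithmetic check of the divisibility claim with Python’s fractions (the polynomial helpers are ad hoc; coefficients in ascending order):

```python
from fractions import Fraction

def pmul(a, b):
    out = [Fraction(0)] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out

def padd(a, b):
    out = [Fraction(0)] * max(len(a), len(b))
    for i, coef in enumerate(a): out[i] += coef
    for i, coef in enumerate(b): out[i] += coef
    return out

def pcompose(p, q):
    # p(q(x)) by Horner's rule
    out = [Fraction(p[-1])]
    for coef in reversed(p[:-1]):
        out = padd(pmul(out, q), [Fraction(coef)])
    return out

def pmod(a, b):
    # remainder of a divided by b, over the rationals
    a = [Fraction(coef) for coef in a]
    while len(a) >= len(b) and any(a):
        f = a[-1] / b[-1]
        for i in range(len(b)):
            a[len(a) - len(b) + i] -= f * b[i]
        a.pop()
    return a

P = [0, 1, 1]              # P(x) = x + x^2: double fixed point at 0
PP = pcompose(P, P)        # P(P(x)) = x + 2x^2 + 2x^3 + x^4
num = PP[:]; num[1] -= 1   # P(P(x)) - x
den = P[:];  den[1] -= 1   # P(x) - x = x^2
assert all(coef == 0 for coef in pmod(num, den))
```

The same machinery checks the easy-mode claim for small n by iterating pcompose.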

Rational approximations to fractional powers of 2

We have the Maclaurin series for \exp(x\log2), but it also turns out that \log2 is irrational so we can’t use that if we want to stay in the realm of rational numbers.

(Proof that \log2 exists at all is left as an exercise for the reader.)

We could use some rational approximation to \log2, like by truncating some infinite series at a finite number of terms, but that is distinctly not fun because this value needs to be fed into the Maclaurin series for the exponential function and then suddenly all the terms become fractions with ginormous numerators and denominators.

Consider the following instead. We have the following Padé approximant of e^x.

\displaystyle e^x=\frac{1+x/2}{1-x/2}+\Theta(x^3)

We can do something similar for 2^x.

\displaystyle 2^x\approx\frac{1+ax}{1-ax}

The natural choice for a here seems to be \tfrac12\log2, which recovers the above Padé approximant, but that leads to irrationality. Instead, we can solve for a value of a such that equality holds when x=1. That gives a=1/3, or, simplifying a bit,

\displaystyle 2^x\approx\frac{3+x}{3-x}.

In case you ever need to find out how much the frequency differs between two notes a semitone apart, but you don’t have a calculator handy, you don’t have 2^{1/12} memorised and you don’t remember all the random ratios in music theory, you can now resort to this trick to quickly get the approximation 37/35. Handy, ain’t it? Of course, it also works for any whole number of semitones separating two notes. This provides a considerably more accurate approximation than just linearly interpolating 2^x between x=0 and x=1. (However, these approximations are not necessarily best rational approximations and, with the exception of 2^{1/3}\approx5/4, have odd numerators and denominators, which may be less convenient to work with.)

As a sanity check, substituting x=1/2 gives \sqrt2\approx7/5, which is a continued fraction convergent of \sqrt2, and taking the derivative at x=0 gives \log2\approx2/3, which is also a continued fraction convergent.
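In code form (exact arithmetic via fractions, just so the ratios drop out automatically):

```python
from fractions import Fraction

def pow2_approx(x):
    """2^x ~ (3 + x)/(3 - x); exact at x = -1, 0, 1, decent in between."""
    x = Fraction(x)
    return (3 + x) / (3 - x)

semitone = pow2_approx(Fraction(1, 12))          # the 37/35 from the text
rel_err = abs(float(semitone) / 2**(1/12) - 1)   # relative error, ~0.2%
```

Compare with linear interpolation, 2^{1/12} ≈ 1+1/12, which is off by about 2%.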

In the above, we chose to solve for a\in\mathbb Q such that 2^x=\tfrac{1+ax}{1-ax} held for x=1. What other choices for x could we have chosen here? x and 2^x should both be rational, which leaves us with just the integers. Since it doesn’t really make sense to use the approximation when |x|\ge1 (because then we have the not-approximation 2^x=2\cdot2^{x-1}), we should just choose the smallest integer possible. Equality already holds for x=0 regardless of a so taking x=0 is meaningless, which leaves x=1 as the next candidate.

The reason we’re not considering negative values of x above is due to symmetry. To wit, for a function f(x):=P(x)/P(-x) (where P is some other function), we have f(-x)=1/f(x), so if the equality holds for x, it holds for -x too.

So far we’ve taken P to be an affine function; extending it to higher-degree polynomials (solving for coefficients that make f(n)=2^n for as many consecutive integers as possible) yields the following successively better approximations.

\displaystyle P_1(x)=3+x\\  P_2(x)=26+9x+x^2\\  P_3(x)=378+131x+18x^2+x^3\\  P_4(x)=7704+2670x+395x^2+30x^3+x^4\\  P_5(x)=201960+69994x+10755x^2+925x^3+45x^4+x^5\\  P_6(x)=6472080+2243052x+352744x^2+32445x^3+1855x^4+63x^5+x^6
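These coefficients can be recovered by imposing P(n)=2^nP(-n) for n=1,\dots,k with P monic of degree k, which is a k-by-k linear system; a Python sketch that solves it exactly over the rationals:

```python
from fractions import Fraction

def pk(k):
    """Ascending coefficients of the monic degree-k polynomial P_k with
    P(n) = 2^n P(-n) for n = 1..k, via exact Gaussian elimination."""
    # unknowns c_0..c_{k-1}; equation n:
    #   sum_j c_j (n^j - 2^n (-n)^j) = 2^n (-n)^k - n^k
    A = [[Fraction(n**j - 2**n * (-n)**j) for j in range(k)]
         for n in range(1, k + 1)]
    b = [Fraction(2**n * (-n)**k - n**k) for n in range(1, k + 1)]
    for col in range(k):
        piv = next(r for r in range(col, k) if A[r][col] != 0)
        A[col], A[piv] = A[piv], A[col]
        b[col], b[piv] = b[piv], b[col]
        for r in range(col + 1, k):
            f = A[r][col] / A[col][col]
            for c in range(col, k):
                A[r][c] -= f * A[col][c]
            b[r] -= f * b[col]
    coeffs = [Fraction(0)] * k
    for r in range(k - 1, -1, -1):
        s = b[r] - sum(A[r][c] * coeffs[c] for c in range(r + 1, k))
        coeffs[r] = s / A[r][r]
    return coeffs + [Fraction(1)]
```

For instance, pk(2) reproduces 26+9x+x^2 from the list above.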

Do these converge to the Maclaurin series of 2^x (modulo normalisation) as the degree tends to infinity? What’s the asymptotic error bound like for these approximations? Is there a closed form for the coefficients?

Empirically speaking, it seems that, considering only the case of equidistant samples, rational function interpolation is a lot more stable and accurate than polynomial interpolation. With polynomial interpolation, it’s generally the case that the function is quite well approximated near the middle samples, but has unbounded error or converges slower at the boundaries; these rational interpolants also have error increasing towards the boundaries, but seem to diverge from 2^x more slowly. (This is discounting the fact that the rational interpolants have random poles that make the function not necessarily defined over all the reals. In fact, P_1(-3)=P_3(-7)=P_5(-11)=P_7(-15)=P_9(-19)=0.)

As mentioned above, as a sanity check, we can see what value P_k(1/2)/P_k(-1/2) takes for various values of k.

\displaystyle \frac{P_1(1/2)}{P_1(-1/2)}=\frac75=[1;2,2]\\  \frac{P_2(1/2)}{P_2(-1/2)}=\frac{41}{29}=[1;2,2,2,2]\\  \frac{P_3(1/2)}{P_3(-1/2)}=\frac{239}{169}=[1;2,2,2,2,2,2]\\  \frac{P_4(1/2)}{P_4(-1/2)}=\frac{1393}{985}=[1;2,2,2,2,2,2,2,2]\\  \frac{P_5(1/2)}{P_5(-1/2)}=\frac{8119}{5741}=[1;2,2,2,2,2,2,2,2,2,2]\\  \frac{P_6(1/2)}{P_6(-1/2)}=\frac{47321}{33461}=[1;2,2,2,2,2,2,2,2,2,2,2,2]

Perhaps not coincidentally, these are the third, fifth, seventh, ninth, eleventh and thirteenth convergents of \sqrt2, respectively. If this pattern extends forever, then P_k(1/2)/P_k(-1/2) converges linearly with error \Theta((1+\sqrt2)^{-2k}).

\displaystyle 2(\log P_1)'(0)=2/3=[0;1,2]\\  2(\log P_2)'(0)=9/13=[0;1,2,3]\\  2(\log P_3)'(0)=131/189=[0;1,2,3,1,6,2]\\  2(\log P_4)'(0)=445/642=[0;1,2,3,1,6,3,1,1]\\  2(\log P_5)'(0)=34997/50490=[0;1,2,3,1,6,3,1,1,2,1,2,2,3]\\  2(\log P_6)'(0)=62307/89890=[0;1,2,3,1,6,3,1,1,2,1,1,1,1,2,3]

These are not all convergents of \log2, but they do seem to converge linearly to \log2 with error something like 34^{-k}. It’d be interesting to find out if there’s any meaningful pattern. The difference (\log P_{k+1})'(0)-(\log P_k)'(0) is a positive unit fraction for each of the values presented above; does this provide an Egyptian fraction decomposition of \tfrac12\log2?

Another eta series

Recall this:


It turns out that we also have:


It’s time for that again.



Term ratios −1/4 and −1/27, respectively; obtained from two different (but alike) transformations of the following series.


Both of the transformed series have initial term 19/6=3.166666… I wonder if there’s any noncontrived (or “naturally occurring”) series for π with an initial term of 22/7.


Just one quick minor observation regarding the discrete-space heat equation over hypercubic lattices.

In the one-dimensional case, the Green’s function is given by the aptly-named discrete Gaussian, with DTFT \exp t(\cos\omega-1). (Note that the “discrete-time” in “discrete-time Fourier transform” is referring to the spatial dimension here, not the time dimension.)

In n\ge1 dimensions, this generalises to \exp t(\sum_{k=0}^{n-1}\cos\omega_k-n), but this is exactly equal to the tensor product of one-dimensional discrete Gaussians \prod_{k=0}^{n-1}\exp t(\cos\omega_k-1). In other words, the Green’s function for the discrete-space heat equation in higher dimensions is separable; specifically, it separates into n copies of the 1D discrete Gaussian.
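A quick numerical confirmation (pure Python; this uses the unnormalised Laplacian u(x-1)+u(x+1)-2u(x), so t here is twice the t above, and a zero boundary, neither of which affects separability; the matrix exponential is applied as a truncated power series):

```python
N, T, K = 21, 1.0, 40   # grid side, diffusion time, series truncation order

def lap1(u):
    # 1D lattice Laplacian, zero (absorbing) boundary
    return [(u[i-1] if i > 0 else 0.0) + (u[i+1] if i < N-1 else 0.0) - 2*u[i]
            for i in range(N)]

def lap2(u):
    # 2D lattice Laplacian = sum of the two 1D ones
    out = [[0.0]*N for _ in range(N)]
    for i in range(N):
        for j in range(N):
            s = -4.0*u[i][j]
            if i > 0:   s += u[i-1][j]
            if i < N-1: s += u[i+1][j]
            if j > 0:   s += u[i][j-1]
            if j < N-1: s += u[i][j+1]
            out[i][j] = s
    return out

def heat1(u):
    # exp(T * lap1) u via the truncated power series
    acc, term = u[:], u[:]
    for k in range(1, K):
        term = [T/k * v for v in lap1(term)]
        acc = [a + t for a, t in zip(acc, term)]
    return acc

def heat2(u):
    acc = [row[:] for row in u]
    term = [row[:] for row in u]
    for k in range(1, K):
        term = [[T/k * v for v in row] for row in lap2(term)]
        acc = [[a + t for a, t in zip(ra, rt)] for ra, rt in zip(acc, term)]
    return acc

c = N // 2
g1 = heat1([1.0 if i == c else 0.0 for i in range(N)])
g2 = heat2([[1.0 if (i, j) == (c, c) else 0.0 for j in range(N)]
            for i in range(N)])

# separability: the 2D Green's function is the outer product of the 1D one
err = max(abs(g2[i][j] - g1[i]*g1[j]) for i in range(N) for j in range(N))
```

The maximum discrepancy is at floating-point noise level, as expected since the horizontal and vertical Laplacians commute.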

This was something I took for granted before, and while it’s not difficult to prove, it wasn’t immediately obvious either. (My gut feel was telling me that it would possess approximate rotational symmetry, which is contradicted by this finding because the tensor product of the discrete Gaussian does not have rotational symmetry.)

Incidentally, looking at a discretised heat equation for various graphs is linked to all the graph spectra I was fooling around with a week ago; the smallest eigenvalue of a graph’s adjacency matrix is related to how coarsely you can approximate the continuous-time heat equation with a discrete-time system.

Edit: what I meant by the lack of rotational symmetry was wrt how quickly the functions decay as the radius increases; under rescaling and interpolation, the discrete Gaussian uniformly converges to the continuous Gaussian as the variance tends to infinity, just as the central limit theorem tells us.

Despite having written a post about the discrete Gaussian before, now I’m not so sure using a discrete Gaussian is relevant at all, even if it wasn’t a hassle to compute compared to the sampled Gaussian. Is the lack of rotational symmetry (in the above sense) a deal breaker? Bah.


In 2014, this blog had a total of 1034 views, according to WordPress stats. (This probably excludes most of my own views, since I’m usually logged in.)

Funny thing is, my not-very-secret blog about certain weeaboo-related activities got 2561 views. It’s “secret” only insofar as I’ve never previously mentioned it or even its existence here. (Naturally, I don’t mention the existence of this one on that one.)

So, what makes this difference?

This very blog is kinda random (entropic, even) and contains loads of bullshit most people (sometimes even myself) wouldn’t be interested in, and I’ve never bothered advertising it in any form. (The closest thing to advertising was posting a screenshot (i.e. you have to type the URL out) on my Twitter feed years ago.) As of now, there probably are about two regular readers. I got a link from /r/math to a post with hardly any substantial content, which is kind of embarrassing. In fact, there hardly ever is any substantial content here, mathematics or otherwise; there were only about two posts here in recent memory that I’d be proud of. (Why do I continue writing here anyway? :iiam:)

The other blog is very niche, and I post on it far less frequently (104 here versus 27 there last year), but it’s somehow found its way around. It helps that it’s been pasted a couple of times in an IRC channel with plenty of people who might find the content interesting (possibly to be disappointed!), and I don’t put as much personal/sensitive information there that would make me afraid of being linked to.

I’m not sure I care about the view statistics either positively or negatively. On one hand, it’s nice if I get lots of views. Hey, popularity is cool. On the other hand, it’s really embarrassing to have lots of people read my inane ramblings. Having a readership also puts some pressure on me to actually produce meaningful content (no offence to the readers here), with which I have plenty of difficulty. Like I said above, there were two recent-ish posts on this blog I consider informative among a hundred essentially-garbage posts.

This here blog does not have any limitation in scope. Anything goes! Rather, anything I don’t mind publicly posting with my RL identity attached goes. The other blog is limited in scope to video encoding and such, and I take some precautions against leaking my RL identity. There’s a small overlap in the list of things I could post either under my RL identity (here) or my “IRC identity” (there), and this is a bit of a conundrum because writing about the same topic twice is stupid, but copying a post from one blog to the other verbatim is even stupider.

Side effects

One unfortunate side effect of abandoning my Twitter account is that I can’t provide a link list for 2014 like I did for 2013.

Then again, I spent the most part of my NEETdom wasting my life away, so the first half of 2014 wouldn’t have contributed much. Besides, the handful of people who followed my Twitter account probably didn’t care too much about the shit I linked, so maybe that’s a plus for them.

I haven’t abandoned Twitter entirely, just the logging-in aspect of it. It’s considerably easier to waste less time on Twitter when reading updates involves more than typing T-down-alt+enter. (Yet I still keep Hexchat open in a secondary workspace and switch to it every so often, which takes only three key presses, all clustered in a corner of the keyboard. Todo: measure how often I do this and surprise myself with the statistics.)

Adjacency matrix eigenvectors

I uploaded the polyhedron visualisation thing to GH. This needs a sufficiently recent browser; it works out of the box in Firefox 37, but Chrome needs experimental JavaScript enabled in chrome://flags.

Anyway, the first three models (the cube and the two dodecahedra) were just for testing purposes and aren’t too interesting on their own. The “(rhombic) dodecahedron α/β/γ k” models correspond to the eigenvectors of the adjacency matrix of the graph of the dual polyhedra of the regular dodecahedron and the rhombic dodecahedron.

>The … to the … of the … of the … of the … of the … and the …

Let’s back up a bit.

Say we’re looking at a cycle graph, which has a nice 2D embedding as a (regular) polygon, or a 1D embedding as a line which wraps at the ends. This has the following adjacency matrix, which is almost tridiagonal except for those two antidiagonal entries. (The important things to note are that this is an adjacency matrix, that adjacency matrices of undirected graphs are symmetric, and that symmetric matrices are diagonalisable.)

\displaystyle\begin{pmatrix}0&1&0&&0&0&1\\  1&0&1&\cdots&0&0&0\\  0&1&0&&0&0&0\\  &\vdots&&\ddots&&\vdots&\\  0&0&0&&0&1&0\\  0&0&0&\cdots&1&0&1\\  1&0&0&&0&1&0\end{pmatrix}

Because of the relation to a cyclic convolution, spectral analysis here is equivalent to a discrete Fourier transform—ergo, the eigenvalues other than ±2 have multiplicity two (because of symmetry), and one choice of eigenvectors is exactly the DFT basis. (−2 is an eigenvalue iff the cycle graph has even order.) Real symmetric matrices can be diagonalised by real orthogonal matrices, and it’s fairly trivial to rotate a DFT basis to an orthogonal set of real vectors.
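Concretely, here’s a quick check that the DFT vectors are eigenvectors of the cycle graph’s adjacency matrix (Python; n is arbitrary):

```python
import cmath, math

n = 7

def adj(v):
    # adjacency action of the n-cycle: (A v)_j = v_{j-1} + v_{j+1}, indices mod n
    return [v[(j - 1) % n] + v[(j + 1) % n] for j in range(n)]

def residual(k):
    # the k-th DFT vector should be an eigenvector with eigenvalue 2 cos(2 pi k/n)
    v = [cmath.exp(2j * math.pi * j * k / n) for j in range(n)]
    lam = 2 * math.cos(2 * math.pi * k / n)
    return max(abs(a - lam * b) for a, b in zip(adj(v), v))

worst = max(residual(k) for k in range(n))
```

The k-th and (n−k)-th vectors share an eigenvalue, which is where the multiplicity-two eigenvalues come from; taking their real and imaginary parts gives the corresponding real orthogonal eigenvectors.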

I was thinking of generalising the concept of a convolution with a symmetric kernel to more vertex-transitive graphs in general, since I happened to have a d12 and an origami rhombic dodecahedron lying about. (They were around stacks of Rubik’s Cubes of various sizes, but that merely desensitised me to cubic objects.) So, I wondered, is there a natural generalisation of the DFT to the faces of these solids? (In other words, to the vertices of their duals.) Is it as simple as sampling spherical harmonics?

I don’t have an answer to that yet, but it’s probably “no”. (Research pending.)

That said, I used NumPy to calculate a set of eigenvectors for the corresponding adjacency matrix for both the regular dodecahedron and the rhombic dodecahedron. (The eigenvectors are not unique, and NumPy gave orthogonal eigenvectors (which are not unique either).) It did tell me the multiplicities of the eigenvalues, but the eigenvectors weren’t readable as is, so I bumbled around for a while before deciding to write the 3D renderer linked at the top of this post.

There’s the “trivial” all-ones eigenvector for both the dodecahedron and rhombic dodecahedron; these have no models because why would you need a visualisation for that. α, β, γ refer to the eigenvalues in increasing order, and there’re as many eigenvectors as the eigenvalue multiplicity for each eigenvalue due to symmetry. The eigenvalue multiplicities for both dodecahedra are 1, 3, 3 and 5. 3 and 5 appearing for the regular dodecahedron might not be unexpected, but it’s a bit surprising that 5 would show up for the rhombic dodecahedron, with a symmetry group order (48) indivisible by five.

The whole idea of generalising convolution was inspired by a paper about optimising dartboards. A generalisation to dice of various shapes was what I was thinking of, even though dice are considerably harder to manipulate than darts. (And of course darts are manipulable; otherwise darts would be a game entirely based on chance.) Maybe it’s more applicable to rigged dice.

More 3DPD

[screenshot: a rhombic dodecahedron with reverse perspective]

Switching between “normal” perspective projection and reverse perspective was as simple as negating the depth constant, so I ended up playing around with it a fair amount. It’s quite… disorienting. (The use of semitransparent polygons didn’t help, because then it sorta felt like I was looking at the insides of the solid with the “really visible” faces semitransparent.)

Oh, and I got around to implementing dragging to rotate, which made playing with it even more fun. Inertia would be nice to have too, eventually; currently it just uses exponential decay, which feels a bit like inertia.

There’s one caveat with very fast rotations. The rotation state is stored as a unit quaternion, and the unit quaternions form a double cover of SO(3) (for any quaternion, both it and its negative correspond to the same SO(3) rotation), so simply taking a great circle path from the initial to the final quaternions will sometimes result in a rotation the “long way”. This cannot happen with a single drag rotation the way I implemented it (which fixes the z coordinate), but resetting the view is one way to trigger this.
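The usual fix, for what it’s worth, is to negate one endpoint whenever the quaternion dot product is negative before interpolating; a slerp sketch in Python:

```python
import math

def slerp(q0, q1, t):
    """Spherical interpolation from q0 to q1, taking the short way in SO(3):
    q and -q are the same rotation, so flip the sign of q1 when the dot
    product is negative, keeping the great-circle arc under 90 degrees in
    quaternion space (i.e. under 180 degrees in rotation space)."""
    dot = sum(a * b for a, b in zip(q0, q1))
    if dot < 0.0:
        q1 = tuple(-c for c in q1)
        dot = -dot
    dot = min(dot, 1.0)
    theta = math.acos(dot)
    if theta < 1e-9:          # endpoints (nearly) equal: nothing to do
        return q0
    s = math.sin(theta)
    a = math.sin((1 - t) * theta) / s
    b = math.sin(t * theta) / s
    return tuple(a * c0 + b * c1 for c0, c1 in zip(q0, q1))
```

With the sign flip, interpolating from the identity to a 350° rotation sweeps through −10° rather than the long way round.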

Anyway, one feature of reverse perspective is that it allows you to see more faces than would be possible with linear projection. (Regular perspective, on the other hand, can obscure faces.) In the above picture, there are nine visible faces of a rhombic dodecahedron, though the maximum with linear or perspective projection is just six. The maximum number of visible faces of a rhombic dodecahedron with reverse perspective is eleven, but this requires a smaller depth than what I used for the screenshot above.


[screenshot: cube rendered with 2D canvas]

I just spent the last few hours wrapping my head around writing an extremely simplistic “3D” renderer with perspective projection. This is done entirely within a 2D canvas context, without any fancy schmancy WebGL or whatever. The 3D coordinates are simply projected onto two dimensions and then painted in some order.

This is actually not the right thing to do, since it entirely disregards z-order. (The above screenshot uses semitransparent polygons.) For convex polyhedra, painting only the faces with normals pointing towards the camera is probably correct. I don’t particularly care about rendering concave polyhedra at the moment (this was meant to be a demo/visualisation for something unrelated to 3D rendering), but it’s clear that this approach wouldn’t work in general.


“Pointing towards the camera” literally means pointing towards the camera, and not “having a positive z component”. It somehow took me quite a while to figure that out. Incidentally, there’s one way to generalise the concept of a surface normal to work for a tuple of vertices which do not necessarily all lie on the same plane. We can triangulate this shape, weight the surface normals of the resulting triangles by their areas (this is practically free if we use a cross product), then add these up. Say we choose one vertex and connect all the others to it for our triangulation. This gives our surface normal as

\displaystyle\mathbf v:=\sum_{k=1}^{n-2}(\mathbf a_k-\mathbf a_0)\times(\mathbf a_{k+1}-\mathbf a_0),

which simplifies to

\displaystyle \mathbf v=\sum_{k=0}^{n-1}\mathbf a_k\times\mathbf a_{k+1},

with the understanding that \mathbf a_n:=\mathbf a_0. This has cyclic symmetry in the vertices, so it turns out not to matter which vertex we choose initially. Furthermore, it can be proven by induction that the choice of triangulation doesn’t matter either: every triangulation gives the same result. In the two-dimensional case, this simplifies to the shoelace area algorithm (but without the 1/2 scaling factor).
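In code, the cyclic form is a one-liner of cross products; here’s a Python sketch together with the fan construction, to check that two different fans agree even on a non-planar quadrilateral:

```python
def polygon_normal(pts):
    """Area-weighted normal via the cyclic sum of a_k x a_{k+1}."""
    n = [0.0, 0.0, 0.0]
    for i in range(len(pts)):
        (ax, ay, az), (bx, by, bz) = pts[i], pts[(i + 1) % len(pts)]
        n[0] += ay*bz - az*by
        n[1] += az*bx - ax*bz
        n[2] += ax*by - ay*bx
    return n

def fan_normal(pts, root):
    """Triangulate as a fan from pts[root]; sum area-weighted triangle normals."""
    p = pts[root:] + pts[:root]
    o = p[0]
    n = [0.0, 0.0, 0.0]
    for k in range(1, len(p) - 1):
        ux, uy, uz = (p[k][i] - o[i] for i in range(3))
        vx, vy, vz = (p[k+1][i] - o[i] for i in range(3))
        n[0] += uy*vz - uz*vy
        n[1] += uz*vx - ux*vz
        n[2] += ux*vy - uy*vx
    return n
```

For a planar polygon the result is (twice) the area times the unit normal, which is the shoelace connection mentioned above.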

Anyway, that turned out to not be very useful with perspective projection because it’d still require a dot product with the line of sight in order to ascertain direction, and since I already needed to project all the points to two dimensions anyway, it’s faster to just use the shoelace algorithm.