Worrying about things I shouldn’t really even be worrying about

Yet another thinking-out-loud post here.

Because of reasons I missed out on two assignments for that English module that contributed to 25% of the total grade, or a shade over a third of the non-exam portion of the grade. Which kind of sucks, I guess, but hey, it already happened.

Over the past three weeks I’ve been having this gigantic dilemma over whether I should just give up and drop the module (it doesn’t affect CAP anyway) or whether I should persevere, which is kind of a difficult decision.

Pros of dropping it:

  • I don’t have to worry about whether my chances of passing are reasonably high. This is a rather significant problem because, as it is, the only way of passing is to get decent marks on the remaining CA part of the grade and score incredibly well on the final exam. On the other hand, it’s not too unthinkable that I might actually do well on the exam. (Ergo, dilemma.)
  • I don’t have to spend more hours this semester researching things I couldn’t care less about. Seriously, how can I be expected to care about social media when I hardly use it?
  • I don’t have to come up with excuses for why I forfeited my grades for the two aforementioned assignments. This is the most immediate concern since the next class is in about ten hours.

Cons:

  • I’ll have to retake it next year.
  • In which case, I’d probably be the only second-year taking it. Might be embarrassing?
  • With the exception of Geometry, all the other non-English modules I’m taking this semester are easymode. Next semester will most likely be different (including non-maths modules), and if I actually have to study things, I will have even less time to work on English assignments. One option is to take only 17 MC of modules next semester, but that sounds kinda suboptimal.
  • Sunk cost makes me feel bad even if it’s a complete fallacy and I know it. Which makes me feel even worse. (Though if I fail to pass this time round, the amount of sunk cost will be even higher.)
  • Even if social media is an awful topic to write about, I’m not sure next semester’s would be any better. Judging from the past semesters, almost every topic relates to real people in some way (last semester’s was about globalisation). I said I didn’t care about social media, but I care about globalisation even less, and chances aren’t high I’ll get a topic I can blog about anyhow. The plus side of social media as the topic is that at least it relates to the Internet and (I like to believe) I know the Internet somewhat well.

Hmm.

All signs point to persevering as the better option. This really is starting to look like a “no matter what happens, I lose” situation though.

Rational approximations to the binary logarithm

Sorta a follow-up to the post on rational approximations for binary exponentiation.

Maybe your first instinct would be to try to invert the rational functions in the earlier post. How well does that work?

For the first-order approximation f_1(x)=(3+x)/(3-x), we have the inverse

\displaystyle f_1^{-1}(x)=\frac{3x-3}{x+1}.

For the second-order approximation f_2(x)=(26+9x+x^2)/(26-9x+x^2), we have the inverse

\displaystyle f_2^{-1}(x)=\frac{9(x+1)-\sqrt{-23x^2+370x-23}}{2(x-1)}.

Oh, whoops, this isn’t a rational function. In general, rational functions are not injective, and it can also be shown that the only rational functions that have rational inverses are the nondegenerate Möbius transformations, so we can’t just proceed with inverting our earlier functions. Or if you decide to proceed anyway…

[screenshot: 378+131x+18x^2+x^3=y*(378-131x+18x^2-x^3)]

Yeah, uh, I get the feeling this is not going in quite the right direction.

Let’s scrap the above and start with rational functions, interpolating over exact powers of 2, where the numerator and denominator have equal degree. We want analogues of the criteria we had for the exponential approximation; for the exponential approximation we had f(x)f(-x)=1, so for our log approximation we should have g(x)+g(1/x)=0. This forces some sort of symmetry over inversion.

It also forces symmetry in the coefficients of the rational interpolant, though in this case the symmetry is not between the numerator and denominator, but within the coefficients of each. The above forces one of the numerator and denominator to be palindromic and the other to be antipalindromic, and since we have g(1)=0, the numerator must be antipalindromic (and therefore the denominator must be palindromic). This cuts down the number of coefficients we have to solve for by a factor of two, which is always cool.
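(To spell that out: write g=p/q with \deg p=\deg q=n, and let p^*(x):=x^np(1/x) denote coefficient reversal. Then g(1/x)=p^*(x)/q^*(x), so the functional equation g(x)+g(1/x)=0 amounts to

\displaystyle p(x)q^*(x)=-p^*(x)q(x),

which is satisfied by taking q^*=q and p^*=-p, i.e. a palindromic denominator and an antipalindromic numerator.)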

For the (1,1)-order interpolant, which has three degrees of freedom, we choose (\frac12,-1),(1,0),(2,1) as the points to interpolate, and by the above optimisation, we only need to solve for (2,1).

\displaystyle g_1(x):=\frac{ax-a}{x+1}\\  g_1(2)=\frac a3=1\implies a=3\\  g_1(x)=3\cdot\frac{x-1}{x+1}

Unsurprisingly, this is exactly the same as f_1^{-1}.

The (2,2)-order interpolant has five degrees of freedom, so we extend our choice of points to include (\frac14,-2),(4,2), though again we need only consider (2,1) and (4,2).

\displaystyle g_2(x):=\frac{ax^2-a}{x^2+bx+1}\\  g_2(2)=1\implies 3a-2b=5\\  g_2(4)=2\implies 15a-8b=34\\  a=\frac{14}3,\,b=\frac92\\  g_2(x)=\frac{28}3\cdot\frac{x^2-1}{2x^2+9x+2}

The (3,3)-order interpolant may be derived likewise.

\displaystyle g_3(x):=\frac{ax^3+bx^2-bx-a}{x^3+cx^2+cx+1}\\  g_3(2)=1,\,g_3(4)=2,\,g_3(8)=3\implies a=\frac{125}{21},\,b=\frac{245}{12},\,c=\frac{49}4\\  g_3(x)=\frac5{21}\cdot\frac{100x^3+343x^2-343x-100}{4x^3+49x^2+49x+4}
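As a quick sanity check (a throwaway snippet of mine, not part of the derivation; print as in my other snippets, or console.log in a browser), the three interpolants can be pitted against Math.log2 over [1,2]:

var g1 = function (x) { return 3*(x - 1)/(x + 1); };
var g2 = function (x) { return 28/3*(x*x - 1)/(2*x*x + 9*x + 2); };
var g3 = function (x) { return 5/21*(100*x*x*x + 343*x*x - 343*x - 100)/(4*x*x*x + 49*x*x + 49*x + 4); };
var worst = [0, 0, 0];
for (var x = 1; x <= 2; x += 1/256)
{
	var t = Math.log2 ? Math.log2(x) : Math.log(x)/Math.LN2; // reference value
	worst[0] = Math.max(worst[0], Math.abs(g1(x) - t));
	worst[1] = Math.max(worst[1], Math.abs(g2(x) - t));
	worst[2] = Math.max(worst[2], Math.abs(g3(x) - t));
}
print(worst); // maximum absolute error over [1,2] for each interpolant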

Once again, rational interpolants provide a much better fit than polynomials, especially in this case where there is a (logarithmic) singularity at x=0. A polynomial would go wild trying to fit that, and the only polynomial that satisfies the functional equation p(x)+p(1/x)=0 is the zero polynomial, which is not very interesting.

I may or may not expand on this post at a later date. Consider this a WIP.

MA2108S Tutorial 7

1 (comparison)

\displaystyle0<\frac{a_n}{1+n^2a_n}<\frac1{n^2}\\  \implies\sum_{n\ge1}\frac{a_n}{1+n^2a_n}\text{ converges}

2 (comparison/condensation)

Suppose n\mapsto x_n is bounded. As a positive bounded increasing sequence, it has a positive limit x. Consequently, there exists some K such that \forall n\ge K\quad x_n>x/2. Suppose m\ge K.

\displaystyle1-\frac{x_m}{x_{m+1}}=\frac{x_{m+1}-x_m}{x_{m+1}}<\frac{x_{m+1}-x_m}{x/2}=\frac2x\cdot(x_{m+1}-x_m)

Then the right-hand sides telescope to something bounded, etc. blah blah blah, and the series converges by comparison.

Suppose n\mapsto x_n is unbounded. We basically want to use something like Cauchy’s condensation test here, but we don’t know exactly how quickly or slowly this sequence diverges. (The “worst case” for divergence would be something like the harmonic series, which the condensation test handles readily, but we can’t assume our sequence diverges logarithmically.)

Construct a subsequence recursively from indices i:k\mapsto i_k where i_1=1 and x_{i_{k+1}}>2x_{i_k}.

\displaystyle\sum_{n=i_k}^{i_{k+1}-1}\left(1-\frac{x_n}{x_{n+1}}\right)=\sum_{n=i_k}^{i_{k+1}-1}\frac{x_{n+1}-x_n}{x_{n+1}}>\sum_{n=i_k}^{i_{k+1}-1}\frac{x_{n+1}-x_n}{x_{i_{k+1}}}=\frac{x_{i_{k+1}}-x_{i_k}}{x_{i_{k+1}}}>\frac12

Therefore divergence.

3 (Stolz’s)

\displaystyle\frac1n\sum_{k\le n}ku_k=\frac1n\sum_{k\le n}(k-n)u_k+\sum_{k\le n}u_k

By Stolz’s theorem, we have the following:

\displaystyle\lim_{n\to\infty}\frac1n\sum_{k\le n}(k-n)u_k=-\lim_{n\to\infty}\sum_{k\le n}u_k=-\sum_ku_k
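(To spell out the Stolz step: the numerator \sum_{k\le n}(k-n)u_k changes, from n to n+1, by

\displaystyle \sum_{k\le n+1}(k-n-1)u_k-\sum_{k\le n}(k-n)u_k=-\sum_{k\le n}u_k,

while the denominator n changes by 1, which gives the equality above.)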

Therefore,

\displaystyle\lim_{n\to\infty}\frac1n\sum_{k\le n}ku_k=-\sum_ku_k+\sum_ku_k=0

4 (AM-GM inequality; comparison)

Cf. Tutorial 6.

5 (Raabe’s)

Let l':=\tfrac{1+l}2.

Suppose l>1. For sufficiently large n (we also need n>ll'/(l-l')),

\displaystyle n\log\frac{u_n}{u_{n+1}}>l'\frac n{n-l'}\\  \frac{u_n}{u_{n+1}}>e^{l'/(n-l')}>1+\frac{l'}{n-l'}\\  \frac{u_{n+1}}{u_n}<1-\frac{l'}n.

This implies convergence of \sum u_n.
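(One way to see the last step: with v_n:=n^{-l'},

\displaystyle \frac{v_{n+1}}{v_n}=\left(1+\frac1n\right)^{-l'}\ge1-\frac{l'}n>\frac{u_{n+1}}{u_n},

so u_n/v_n is eventually decreasing, hence u_n=O(n^{-l'}), and l'>1 gives convergence by comparison with a p-series.)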

Suppose l<1. For sufficiently large n (we also need n>|l'|),

\displaystyle n\log\frac{u_n}{u_{n+1}}<l'\\  \frac{u_n}{u_{n+1}}<e^{l'/n}\le\frac1{1-\frac{l'}n}\quad\left(\text{if }\left|\frac{l'}n\right|<1\right)\\  \frac{u_{n+1}}{u_n}>1-\frac{l'}n.

This implies divergence of \sum u_n.

6 (telescoping)

\displaystyle a_n\prod_{k\le n}\frac1{1+a_k}=\prod_{k\le n-1}\frac1{1+a_k}-\prod_{k\le n}\frac1{1+a_k}
\displaystyle\sum_{n=1}^\infty a_n\prod_{k\le n}\frac1{1+a_k}=1-\prod_{k=1}^\infty\frac1{1+a_k}

As the partial sums are increasing and bounded above by 1, the series converges.

7 (application of 6)

Let \alpha:=\liminf_{n\to\infty}(\frac{u_na_n}{u_{n+1}}-a_{n+1}). For sufficiently large n\ge n_0, \frac{u_na_n}{u_{n+1}}-a_{n+1}>\alpha/2. WLOG let n_0:=1.

\displaystyle\frac{u_na_n}{u_{n+1}}>a_{n+1}+\frac\alpha2\\  \frac{u_n}{u_{n+1}}>\frac{2a_{n+1}+\alpha}{2a_n}\\  \frac{u_{n+1}}{u_n}<\frac{2a_n}{2a_{n+1}+\alpha}\\  u_n<u_1\frac{2a_1}{2a_2+\alpha}\cdots\frac{2a_{n-1}}{2a_n+\alpha}\\  =2u_1a_1\frac{2a_2}{2a_2+\alpha}\cdots\frac{2a_{n-1}}{2a_{n-1}+\alpha}\frac1{2a_n+\alpha}\\  =\frac{2u_1a_1}\alpha\frac1{1+\alpha/(2a_2)}\cdots\frac1{1+\alpha/(2a_{n-1})}\frac{\alpha/(2a_n)}{1+\alpha/(2a_n)}

Then apply 6.

8 (harmonic series comparison)

This is very similar to Raabe’s test, but has somewhat weaker conditions. Let \alpha:=\liminf_{n\to\infty}n(\frac{b_n}{b_{n+1}}-1). (The question uses the limit, but we can weaken this to just the inferior limit.)

\displaystyle\exists K\ge\alpha\quad\forall n\ge K\quad n\left(\frac{b_n}{b_{n+1}}-1\right)>\frac\alpha2\\  \frac{b_n}{b_{n+1}}>1+\frac\alpha{2n}\\  \frac{b_{n+1}}{b_n}<1-\frac\alpha{2n+\alpha}\le1-\frac\alpha{3n}\le\exp\left(-\frac\alpha{3n}\right)\\  b_n<b_K\exp\left(-\frac\alpha3\left(\frac1K+\cdots+\frac1{n-1}\right)\right)\to0

By the squeeze theorem, b_n\to0, and so \sum_{n\ge1}(-1)^nb_n converges.

Note: this is faulty reasoning.

9 (root)

Root test: for sufficiently large n, a_n^{1/n}<\frac{1+q}2<1.

10 (p-series)

\displaystyle-\log a_n\ge (1+\alpha)\log n\implies a_n\le\frac1{n^{1+\alpha}}\implies\sum_{n\ge1}a_n\text{ converges}

Chase

Remember four years ago when I floated the idea of negative masses?

There’re some “obvious” things that can be said about a system with two point objects of nonzero masses m and -m. As previously mentioned, if the system is initially stationary, the two objects will start moving in tandem, accelerating in the direction of the positive-mass object. The reason for this is that the gravitational fields of the two objects are equal in magnitude but opposite in sign; the negative-mass object falls towards the positive-mass object with exactly the same acceleration as the positive-mass object’s falling away from the negative-mass object.

What is perhaps slightly less obvious is that the objects will experience the same acceleration even if the objects do not initially have the same velocity; in other words, there is a (noninertial) reference frame where both objects move in straight lines at constant velocity. There is no way for such a system to have a closed orbit in any nonrotating reference frame.
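To make the “obvious” part concrete, here’s a throwaway one-dimensional sketch (mine; G, m, dt and the initial conditions are arbitrary, and print is as in my other snippets). The two accelerations are identical at every step, so the separation never changes while both velocities grow; giving the objects different initial velocities changes the trajectories but not the equality of the accelerations.

var G = 1, m = 1, dt = 1e-3;
var pos = [0, 1], vel = [0, 0]; // positive mass at x = 0, negative mass at x = 1
for (var step = 0; step < 10000; step++)
{
	var r = pos[1] - pos[0], r3 = Math.abs(r*r*r);
	var a0 = G*(-m)*r/r3; // acceleration of the +m object due to the -m object (points from the negative mass towards the positive one)
	var a1 = G*(+m)*(-r)/r3; // acceleration of the -m object due to the +m object; always equals a0
	vel[0] += a0*dt; vel[1] += a1*dt;
	pos[0] += vel[0]*dt; pos[1] += vel[1]*dt;
}
print([pos[1] - pos[0], vel[0], vel[1]]); // separation stays exactly 1; the two velocities are identical and growing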

On the other hand, in general, if we have two masses m_1,m_2 where m_1+m_2\ne0, there will be a circular orbit where both objects revolve around their barycentre. As the barycentre’s location involves dividing by m_1+m_2, this gives another reason as to why in our earlier system (where m_1+m_2=0) there cannot be a circular orbit.

MA2108S Tutorial 6

1: Apply Stolz’s theorem.

\displaystyle\lim_{n\to\infty}{a_1+2a_2+\cdots+na_n\over n^2}=\lim_{n\to\infty}{(n+1)a_{n+1}\over2n+1}=\lim_{n\to\infty}{n+1\over2n+1}a_{n+1}={1\over2}\alpha

2: WLOG suppose a:=0 and b:=1. Then let \Sigma:=\{(-1/2,1),(0,1/2)\}. (I’m not sure there’s much more to this question.)

3: It’s obvious that the series of the finite differences is bounded termwise by q^n and hence is absolutely convergent.

4: Take the logarithm (justified since every term is positive) to get the equivalent problem (where a_n=\log s_n\in\mathbb R and the corresponding limits are in \mathbb R\cup\{\pm\infty\}):

\displaystyle\liminf_{n\to\infty}(a_{n+1}-a_n)\le\liminf_{n\to\infty}\frac{a_n}n\le\limsup_{n\to\infty}\frac{a_n}n\le\limsup_{n\to\infty}(a_{n+1}-a_n)

The middle inequality is trivial, and the first and third inequalities are conjugates under sign change, so it will suffice to prove just the third. If \limsup_{n\to\infty}(a_{n+1}-a_n)=\infty we are done. Otherwise, there is a finite largest limit point of (a_{n+1}-a_n). Let \alpha:=\limsup_{n\to\infty}(a_{n+1}-a_n)<\infty. Then given arbitrary \epsilon>0, there exists K\in\mathbb N such that for n\ge K, a_{n+1}-a_n<\alpha+\epsilon.

For n\ge2K, we have {a_n=(a_n-a_{n-1})+\cdots+(a_{K+1}-a_K)+a_K<(n-K)\alpha+(n-K)\epsilon+a_K<n\alpha+n\epsilon+O(1)}, so a_n/n<\alpha+\epsilon+O(1/n) and thus \limsup_{n\to\infty}a_n/n\le\alpha=\limsup_{n\to\infty}(a_{n+1}-a_n).

5: Let b\in[0,1]. ?? Holy crap, this is so obvious but how the heck do you prove this. Contradiction?

Suppose b is not a cluster point; then there is an \varepsilon-neighbourhood and K\in\mathbb N such that for all n\ge K a_n\notin V_\varepsilon(b).

Actually it seems easier to prove equidistribution and have this as an immediate consequence. Or not!

We can construct a decreasing subsequence of n\mapsto a_n recursively (like the gcd algorithm presented before), although as it is we still have the problem of showing that this construction leads to a subsequence that converges to zero. If we cheat and use rounding instead of flooring, we get a contractive sequence with factor 1/2, which obviously converges; there can be a negative sign in the terms, so this is not necessarily a subsequence, but it can be made into one. (The reason this will never be zero is because every term is irrational, by definition.)

After which, we can just do quotient-remainder where the remainder is bounded by arbitrary small values. QED.

6: \displaystyle\tan\left(\frac\pi4+\frac1n\right)=\frac{\left(\tan\frac\pi4+\tan\frac1n\right)}{1-\left(\tan\frac\pi4\right)\left(\tan\frac1n\right)}=\frac{1+\tan\frac1n}{1-\tan\frac1n}

\displaystyle\left(1\pm\tan\frac1n\right)^n=\left(1\pm\tan\frac1n\right)^{\cot\frac1n\cdot n\tan\frac1n}\to \exp\left(\pm\lim_{n\to\infty}{n\tan\frac1n}\right)=e^{\pm1}

\displaystyle\lim_{n\to\infty}\left(\frac{1+\tan\frac1n}{1-\tan\frac1n}\right)^n=\frac{e}{1/e}=e^2

7: Compare with \zeta(3/2). (p-series + comparison test.)

8: First one follows from AM-GM inequality and the second one follows from the first. Third: take b_n:=1/n and apply the first to show that it is absolutely convergent (and hence convergent).

9: False. Counterexample: harmonic series.

10: a_n=(n-1)\alpha+a_1

\displaystyle\frac{1/a_{n+1}}{1/a_n}=1-\frac\alpha{n\alpha+a_1}=1-\frac1n+\frac{a_1}{n(n\alpha+a_1)}

\displaystyle\frac{a_1}{n(n\alpha+a_1)}n^2\to\frac{a_1}\alpha\implies\frac{a_1}{n(n\alpha+a_1)}n^2\text{ bounded}

By Gauss’s test, this series diverges.

√10

Um, hello, why is everyone celebrating Pi Day instead of a more culturally significant number like, you know, 10?

Or its square root, at least. That’s gotta be half as important, and considering how important 10 is, √10 must rank pretty high up too. In the last post, we showed that nonnegative numbers have square roots, and 10 is clearly nonnegative, so √10 really exists. It would be rather embarrassing were we to start deifying some imaginary entity!

We’ll be slightly sacrilegious and approximate 10 with 9, because 9 is a nice number. Why is 9 a nice number? Well, duh, of all the numbers, its English spelling is the closest to the word “nice”.

But that aside, we also have 3² = 9. Thus:

\displaystyle\sqrt{10}=\sqrt9\sqrt{10\over9}=3\left(9\over10\right)^{-1/2}=3\left(1-{1\over10}\right)^{-1/2}=3\sum_{k=0}^\infty\binom{2k}{k}{1\over40^k}
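(The last step uses the central binomial series for the inverse square root,

\displaystyle(1-x)^{-1/2}=\sum_{k=0}^\infty\binom{2k}{k}\left(\frac x4\right)^k,

with x=1/10.)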

This is a series where the terms (and consequently, the partial sums) all happen to be terminating decimals and it converges at just about one digit per term, though on average each term has two more decimal places when written out in full. On the other hand, we can also write it as follows.

\displaystyle\sqrt{10}=\sum_{k=0}^\infty{(2k-1)!!\over(2k)!!}{3\over10^k}

Which would allow us to convert this into a spigot algorithm without even having to use a bignum library.

var A = []; // output digits (base 100000)
var B = [0]; // input digits (variadic base)
var n = 1e4; // number of decimal digits to generate
var i, j, q;
for (i = 1; i <= n; i++) B[i] = 3*(2*i-1); // initialisation
for (i = 0; i < n/5; i++)
{
	for (j = n, q = 0; j > 0; j--) // q: carry from next digit
	{
		B[j] = B[j]*1e5 + q*(2*j-1); // multiply digit and add carry
		q = (B[j] / ((2*j)*10))|0; // calculate carry
		B[j] -= q * ((2*j)*10); // subtract carried amount
	}
	A[i] = q; // carry for last digit ~ integer part (most of the time)
}
while (i --> 0)
{
	if (A[i] >= 1e5) // overflow in the extracted digits (happens very rarely)
	{
		A[i] -= 1e5; // subtract
		A[i-1]++; // and carry
	}
	A[i] = (1e5 + A[i] + '').substring(1); // convert to 0-padded string
}
print('3.' + A.join('')); // concatenate and prepend a "3."

Or if you like it compact (but still entirely valid strict JavaScript with no implicit semicolons):

for(var A=[],B=[0],n=1e4,i=0,j,q,s='';i++<n;)B[i]=6*i-3;for(i=0;i<n/5;A[i++]=q)for(j=n,q=0;j;B[j]-=20*(q=B[j]/20/j|0)*j--)B[j]=B[j]*1e5+2*q*j-q;for(;i-->0;s=(1e5+A[i]+'').slice(1)+s)A[i-1]+=A[i]>=1e5;print('3.'+s);

Previously.

Addendum:

How rarely is “very rarely”? I didn’t bother doing the calculation before, but I got bored one day and worked out the corresponding series. The overflow happens with “probability” \sqrt{10}/3+3/\sqrt{10}-2\approx0.0028.

While the above code is based on Rabinowitz and Wagon’s algorithm, there is a difference in where the denominators are handled. Theirs does it in the carry, mine does it in the digit. (I only recently realised I interpreted their algorithm wrong, after having implemented my version quite a few times in the past few years.) Of course, as the underlying hypergeometric series is the same, convergence isn’t affected much. It’s just that, if I’d implemented the above with Rabinowitz and Wagon’s version, the overflow would happen with the considerably higher “probability” \sqrt{10}/27\approx0.12.

Why the scare quotes? Because this calculation assumes that, when we write \sqrt{10} multiplied by powers of 100000 in our variadic base, every legal digit is equally likely, which is actually a pretty strong statement that we have good reason to believe is true, but have no proof of. (We don’t even have a proof that such a digital equidistribution property, itself a weaker form of normality, holds in a usual integer base.)

Nonnegative numbers have square roots

Claim 0: 0 has a nonnegative square root.
Proof: 0² = 0.

Claim 1: 1 has a nonnegative square root.
Proof: 1² = 1.

Claim 2: A nonnegative number a with a nonnegative (or nonpositive) square root x has a nonpositive (or nonnegative, resp.) square root −x.
Proof: (−x)² = x² = a; x ≥ 0 ⇒ −x ≤ 0; and x ≤ 0 ⇒ −x ≥ 0.

Claim 3: A nonnegative number a with a nonnegative square root x has a unique nonnegative square root, a unique nonpositive square root, and no other square roots.
Proof: We already have x and, by the previous claim, −x as square roots. Suppose there is a distinct root y, where y − x ≠ 0 and y + x ≠ 0. Then (y − x)(y + x) = y² − x² = y² − a ≠ 0 ⇒ y² ≠ a, a contradiction.

Claim 4: Squaring preserves order on two nonnegative numbers.
Proof: x ≤ y ⇒ x² ≤ xy ≤ y²; also x < y ⇒ x² ≤ xy < y², so x² < y².

Claim 5: Square-rooting (taking the nonnegative square root) preserves order on two nonnegative numbers with square roots.
Proof: If there were two nonnegative numbers with square roots in a different order, squaring them would contradict the previous claim.

Claim 6: A number a greater than 1 has a positive square root.
Proof: Let a be a real number greater than 1 whose square root we will be interested in. Let S := {x ∈ ℝ | 0 < x² ≤ a}. As 1 ∈ S, S is nonempty; it is also bounded above (by a, say), and thus has a least upper bound.

Let x := sup S and y := (ax+a)/(x+a). If x2 = a, we are done. Henceforth assume x2a.

a>1\implies a^2>a\implies a^2-a>0
1\in S\implies x\ge1>0\implies (x+a)^2>a^2>a^2-a>0
\displaystyle0<\frac{a^2-a}{(x+a)^2}<1

\displaystyle y^2-a=\frac{(ax+a)^2}{(x+a)^2}-a=(x^2-a)\cdot\frac{a^2-a}{(x+a)^2}

If x^2>a, then

x^2-a>0\\  y^2-a<x^2-a\\  y^2<x^2\\  y<x,

but also

y^2-a>0\\  y^2>a\ge z^2\quad\forall z\in S\\  y>z\quad\forall z\in S\\  y\ge\sup S=x,

a contradiction.

If x^2<a, then

x^2-a<0\\  y^2-a>x^2-a\\  y^2>x^2\\  y>x,

but also

y^2-a<0\\  y^2<a\\  y\in S\\  y\le\sup S=x,

again a contradiction.

Therefore x is a positive square root of a.
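Incidentally, the map x\mapsto(ax+a)/(x+a) from the proof also works as a crude numerical method: starting from any positive x it moves monotonically towards \sqrt a, which is essentially what the two contradiction cases exploit. A throwaway sketch (mine, not part of the proof):

var a = 10, x = 1;
for (var k = 0; k < 60; k++) x = (a*x + a)/(x + a); // the fixed points of this map are exactly ±sqrt(a)
print([x, Math.sqrt(a)]); // the two values agree to within double-precision rounding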

Claim 7: A positive number a less than 1 has a positive square root.
Proof: By the above, 1/a has a positive square root x. 1/x > 0 and (1/x)² = 1/x² = 1/(1/a) = a.

Claim 8: A nonnegative number has a nonnegative square root.
Proof: Apply claims 0, 1, 6 or 7 as necessary.

Claim 9: A nonnegative number a has a unique nonnegative square root, a unique nonpositive square root, and no other square roots.
Proof: By claim 8, a has a nonnegative square root, and by claim 3, it has a unique nonnegative square root, a unique nonpositive square root, and no other square roots.

“Buzzword” is a buzzword

So uh, “e-learning week” in NUS, and I figure many students will be using this week as a second term break.

Me, I’m going through the webcasts of the Linear Algebra II lectures I missed, and I missed quite a few classes because of sleep deprivation from staying up to do English shit. Except one class I missed because I needed the time to rewrite/redo my Mathematical Analysis I assignment, and damn am I glad I did that, because I completely screwed up the definition of a sequence’s limit in the first two questions.

I have a few things to say about the webcasts. The most annoying thing is that the aspect ratio is wrong. It’s 2015, how do people still manage to get something as simple as aspect ratio wrong? Secondly, I don’t know if this is because I don’t have Silverlight, but I can only view the exact same thing as what’s presented on the computer monitor. The LA2 lecturer uses the visualiser a lot so this makes the webcasts kind of useless, hurr.


Tomorrow I’ll head out and go buy a new hard drive. Someone’s been bugging me to do that (because I can’t be productive with so little free space on my current hard drives and he needs me to be productive).

MA2108S Tutorial 4 question 2 (original)

Idk why Prof Xu decided to “correct” the question to make it easier. It’s perfectly solvable as it is.

Let x:n\mapsto x_n be a sequence such that for any sequence y:n\mapsto y_n, at least one of these equalities holds:
\displaystyle\limsup_{n\to\infty}(x_n+y_n)=\limsup_{n\to\infty}x_n+\limsup_{n\to\infty}y_n\quad\text{(additivity)}
\displaystyle\limsup_{n\to\infty}x_ny_n=\left(\limsup_{n\to\infty}x_n\right)\left(\limsup_{n\to\infty}y_n\right)\quad\text{(multiplicativity)}

Show that x is convergent.

Take y_n:=-x_n. If additivity holds, then
\displaystyle\mathrm{LHS}=\limsup_{n\to\infty}0=0
\displaystyle\mathrm{RHS}=\limsup_{n\to\infty}x_n-\liminf_{n\to\infty}x_n
\displaystyle\implies\limsup_{n\to\infty}x_n=\liminf_{n\to\infty}x_n\\  \implies\lim_{n\to\infty}x_n=\limsup_{n\to\infty}x_n=\liminf_{n\to\infty}x_n.

If multiplicativity holds, then
\displaystyle\mathrm{LHS}=\limsup_{n\to\infty}(-x_n^2)=-\liminf_{n\to\infty}x_n^2=-\left(\liminf_{n\to\infty}|x_n|\right)^2\le0
\displaystyle\mathrm{RHS}=-\left(\limsup_{n\to\infty}x_n\right)\left(\liminf_{n\to\infty}x_n\right).
As the LHS is nonpositive, so is the RHS, which implies
\displaystyle\left(\limsup_{n\to\infty}x_n\right)\left(\liminf_{n\to\infty}x_n\right)\ge0,
which implies that the superior and inferior limits are either both nonpositive or both nonnegative.

If they’re both nonpositive, then \limsup x_n=-\liminf |x_n| and \liminf x_n=-\limsup |x_n|, and if they’re both nonnegative then \limsup x_n=\limsup |x_n| and \liminf x_n=\liminf |x_n|. Either way,
\displaystyle\left(\liminf_{n\to\infty}x_n\right)\left(\limsup_{n\to\infty}x_n\right)=\left(\liminf_{n\to\infty}|x_n|\right)\left(\limsup_{n\to\infty}|x_n|\right).

Furthermore, note that x is either eventually nonpositive or eventually nonnegative. (This follows trivially from the above and the definition of the superior and inferior limits.) Without loss of generality, assume that x is either everywhere nonpositive or everywhere nonnegative.

\displaystyle\left(\liminf_{n\to\infty}|x_n|\right)^2=\left(\liminf_{n\to\infty}|x_n|\right)\left(\limsup_{n\to\infty}|x_n|\right)\\  \implies\liminf_{n\to\infty}|x_n|=\limsup_{n\to\infty}|x_n|\\  \implies\liminf_{n\to\infty}x_n=\limsup_{n\to\infty}x_n\\  \implies\lim_{n\to\infty}x_n=\liminf_{n\to\infty}x_n=\limsup_{n\to\infty}x_n

QED.

The corrected version of the question merely asks to show that a sequence satisfying the additivity condition is convergent and that a sequence satisfying the multiplicativity condition is convergent, which is considerably easier. Almost too easy compared to all the other questions in this module, even!

Shit’s broken

Some time ago I adopted this mentality where I should prioritise things that’re due/overdue over all those things that are not overdue. Sounds great? Yeah, I thought so too.

What this ended up meaning was that I shunted aside all these “lower priority” tasks to “focus” on the shit that’s due, and then ended up procrastinating by doing other things. (Said procrastination stems from doing lots of things that individually don’t take up a lot of time and hence get fast-tracked by my brain’s scheduling algorithm.) So I’ve done none of the high-priority tasks, and none of the low-priority tasks either! This is arguably a worse outcome than doing none of the high-priority tasks but actually finishing some low-priority ones.

Also, I have an 800-word essay due Monday (i.e. two days ago) that I haven’t started on. Lawd, English classes. But somehow I managed to get the top score in my group for the last essay submission. Not that that’s much of an accomplishment since half the class is ESL, though, and it was only a B anyway.

I don’t know what to do anymore. I don’t think I can keep up with this, but at the same time, I really want to be rid of this nonsense. I guess it helps that I don’t really have to spend much time studying for the four midterm tests I have this Friday? Not that I’d spend that time on the essay.

Like, here I am, rambling on my personal blag instead of doing this rambling on the blag for the English module. (Yes, I now have yet another blog.) What the hell am I doing?

Update (fifteen hours later):

So, I’ve made “some” progress on the researching front. I even woke up early to do that, except I ended up browsing HN and such because ???. Derp. Have I mentioned that another aspect of the English module grading is peer review, and that means I have to read what other people write?

I have a rather high standard for correctness of grammar in what I read. Unfortunately, and this shouldn’t really have been a surprise, these people can’t into grammar. I guess I’m like an outlier in grammar knowledge in the whole class; I’m only in it because I can’t string words to form coherent sentences, maybe? Funnily enough, sometimes it also seems that the tutor is desperate to correct sentences which aren’t even wrong because she misparsed them initially.

It’s easy to do that with English because nouns don’t have ten million forms depending on context, like Latin does. I don’t even remember what all these forms are called. Ergo, Buffalo buffalo Buffalo buffalo buffalo buffalo Buffalo buffalo. So anyway, sometimes a sentence will have an uncommon structure but have a low edit distance to a common structure, and people will misinterpret it as being a typo of the common structure rather than treating it for what it is.

Also, I am so, so, so screwed. I literally have to finish writing up the eight hundred words by tonight, and I also have that Mathematical Analysis tutorial to go through. Hurrrr.

2D dragging as an interface for 3D rotation

So when I implemented my primitive 3D “renderer”, something I had to think about was how to implement drag rotation. Prior to writing the code, and prior to even thinking about using 3D models for visualising eigenvectors, I had pondered about how drag rotation should work.

In three dimensions, if you specify two unit vectors (normalise them if they’re not unit vectors), there are infinitely many rotations from the first vector to the second, because you can take any such rotation then left-compose it with a rotation about the second vector to get another such rotation. However, in general, there is a unique shortest rotation that follows a great circle path. (It fails to be unique exactly when the two vectors are antipodal.) The axis of this rotation is the cross product of the vectors and the angle around the axis to rotate is the arccosine of the dot product. The resulting rotation can be represented in multiple ways, such as axis-angle, Euler angles, a 3×3 matrix, a quaternion, etc.

In the quaternion case, the above operation of generating a rotation quaternion from two vectors can be done without using any trigonometric functions at all, which is an optimisation that’s apparently used a lot in video games and such. I won’t go into detail about this.
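For the record, the trig-free construction goes roughly like this (a sketch of the standard trick, not necessarily what my renderer does verbatim): for unit vectors u and v, the unnormalised quaternion (1+u\cdot v,\,u\times v) already represents the shortest rotation taking u to v, because after normalisation its scalar and vector parts become \cos(\theta/2) and \sin(\theta/2) times the unit axis.

function quatFromTo(u, v) // shortest rotation from unit vector u to unit vector v, as [w, x, y, z]
{
	var d = u[0]*v[0] + u[1]*v[1] + u[2]*v[2]; // dot product
	var cx = u[1]*v[2] - u[2]*v[1], cy = u[2]*v[0] - u[0]*v[2], cz = u[0]*v[1] - u[1]*v[0]; // cross product
	var w = 1 + d; // unnormalised scalar part; degenerate when u and v are antiparallel
	var n = Math.sqrt(w*w + cx*cx + cy*cy + cz*cz);
	return [w/n, cx/n, cy/n, cz/n];
}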

The very first idea I had for 2D dragging was something like this. Say you have some kind of surface z=f(x,y) lying above whatever you’re rendering; then a drag from (x_0,y_0) to (x_1,y_1) would be treated as the shortest rotation from (x_0,y_0,f(x_0,y_0)) to (x_1,y_1,f(x_1,y_1)) after appropriate normalisation. I was being intellectually lazy about what kind of surface to use when making my visualisation demo so I just picked a flat surface f:(x,y)\mapsto1, which seems to work reasonably well/intuitively with my face-transitive models.

Or is the “intuitiveness” a side effect of conditioning because everyone else also uses the same idea to convert 2D drags to 3D rotations, or is it just that the surface used doesn’t really matter all that much? Who knows.

One peculiar result of using the above dragging convention (at least with a flat surface) is that dragging in a circle will rotate the model in the same direction as the dragging. This might or might not be useful, but it’s interesting to note nonetheless.

After this was implemented in my renderer, I added an alternative convention where the rotation was location-independent; a drag from (x,y) to (x+\Delta x,y+\Delta y) would produce the same rotation as a drag from (0,0) to (\Delta x,\Delta y), or, more symmetrically, from (-\Delta x/2,-\Delta y/2) to (\Delta x/2,\Delta y/2). The rotation is then generated by the two vectors (-\Delta x/2,-\Delta y/2,z) and (\Delta x/2,\Delta y/2,z) where z is some constant. (Note that since these two vectors are mirror images of each other through the z-axis, their cross product must be perpendicular to the z-axis and hence lie on the xy-plane. This applies to the rotation axis too, since it’s just a scalar multiple of the cross product.)

Unlike the first convention, you can effect an arbitrarily large rotation in a single drag with this convention simply by dragging in one direction far enough; with the first convention, dragging along a straight line passing through the origin results in a net rotation by only π.

This drag→rotation convention also leads to the even stranger result that dragging in a circle will make the model rotate in the opposite direction. Or, I guess it’s not all that strange if you try to think about what dragging in a circle is really doing.

The above seems to suggest that we should be able to interpolate between the two presented conventions so that dragging in a circle leads to zero net rotation. Given the two 2D vectors (x_0,y_0) and (x_1,y_1), define the following vectors, where z is the height of our surface to project the cursor onto.

\displaystyle \mathbf m:=(\tfrac{x_0+x_1}2,\tfrac{y_0+y_1}2)\\  \mathbf d:=(\tfrac{x_1-x_0}2,\tfrac{y_1-y_0}2)

Further define \mathrm{quat}:\mathbb R^3\times\mathbb R^3\to\mathbb H as a function producing a quaternion out of two vectors, normalising them as necessary. The result is undefined if the vectors are degenerate (one of them is zero, they’re parallel/antiparallel, etc.). As an abuse of notation, we’ll use this function on a pair of 2D vectors, in which case you should assume the z-coordinate is 1.

The two conventions are then \mathrm{quat}(\mathbf m-\mathbf d,\mathbf m+\mathbf d) and \mathrm{quat}(-\mathbf d,\mathbf d), from which we see that the obvious way to interpolate this is to use \mathrm{quat}(c\mathbf m-\mathbf d,c\mathbf m+\mathbf d) for some value of c, and indeed, using c=1/2 seems to result in zero net rotation when dragging in circles, or, in other words, the resulting rotation depends only on where you start/stop the drag and is path-independent.
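In code, the interpolated convention might look something like the following sketch (mine; quatFromTo is the helper sketched earlier, and the projection surface is the plane z=1):

function dragToQuat(x0, y0, x1, y1, c) // c = 1: surface convention; c = 0: location-independent; c = 1/2: apparently path-independent
{
	var mx = (x0 + x1)/2, my = (y0 + y1)/2; // midpoint m
	var dx = (x1 - x0)/2, dy = (y1 - y0)/2; // half-difference d
	var ax = c*mx - dx, ay = c*my - dy, bx = c*mx + dx, by = c*my + dy; // c*m - d and c*m + d
	var na = Math.sqrt(ax*ax + ay*ay + 1), nb = Math.sqrt(bx*bx + by*by + 1); // normalise, with z = 1 appended
	return quatFromTo([ax/na, ay/na, 1/na], [bx/nb, by/nb, 1/nb]);
}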

How would one go about proving that this is really path-independent? Quaternion multiplication is not commutative, so we can’t just convert an infinite composition of infinitesimal rotations into an integral of something by taking the logarithm of the associated quaternions.

BTW sandpile, part deux

Because I have way too much free time[citation needed], I spent a couple of hours yesterday (er, Wednesday) afternoon optimising my BTW sandpile code to exploit the D4 symmetry inherent in starting with a single “stack” of sand.

All I have to say is that dealing with edge cases is annoying and contributed to the bulk of those hours. It was necessary to duplicate (ugh) the code to handle edge cases separately instead of within the main loop because branch prediction seems to suck in JS. This improved the speed by about 30%.

It runs about four times as fast as the old code; less than the eightfold speedup I expected, but still quite an improvement nonetheless.
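For reference, the completely unoptimised toppling loop (no D4 symmetry, no separate edge handling) looks something like the sketch below; this is not the code the post is about. The grain count and grid size are arbitrary, chosen so the pile comfortably fits, and the outermost ring of cells is never toppled, which doesn’t matter as long as the pile never reaches it.

var S = 20000, GRID = 201, mid = (GRID - 1)/2;
var h = [];
for (var i = 0; i < GRID*GRID; i++) h[i] = 0; // heights, row-major
h[mid*GRID + mid] = S; // all grains start at the centre
var unstable = true;
while (unstable)
{
	unstable = false;
	for (var y = 1; y < GRID - 1; y++)
		for (var x = 1; x < GRID - 1; x++)
		{
			var j = y*GRID + x;
			if (h[j] >= 4)
			{
				var t = (h[j]/4)|0; // topple in bulk rather than four grains at a time
				h[j] -= 4*t;
				h[j-1] += t; h[j+1] += t; h[j-GRID] += t; h[j+GRID] += t;
				unstable = true;
			}
		}
}
print(h[mid*GRID + mid]); // centre height after stabilisation (some value in 0..3)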

[BTW sandpile, 800000 grains]

Here, have an 800000-grain run that took 199.3 seconds. The corresponding 400000-grain run took 50.5 seconds, which fits my heuristic guess of \Theta(S^2) complexity a bit too well. By extrapolation, a 28-million-grain run would take about 1225 times as long, i.e. about 68 hours. Multithreading and SIMD ought to be able to bring this down by a factor of maybe 5, though neither of those are trivial to implement.

Update: here’s a 7-million-grain run, in greyscale and with the R/G/B palette. Note the qualitative similarity with the one on Wikipedia; maybe the sandpile will tend to some fractal-y structure? Run-time logs have also been uploaded.

Update 2: arXiv:1105.0111 [math.AP]

Previously.

httpd

I’ve been running a web server on my Linux-running laptop for years now. (Not as in the uptime, but for how long ago I set it up.) As I’ve mentioned before, this laptop’s gone through multiple Linux installations, most of which I installed nginx on, and since nginx configs are somewhat forwards-compatible, I didn’t have much trouble just copying the config from one install to another. The web server has only ever served static content, which probably helped with that.

Anyway, I had it set up so that roughly the same configs were listening to ports 80 (HTTP) and 443 (HTTPS), but the HTTP server required password authentication. There wasn’t really any particular reason for doing this that I can recall; maybe it was so that only I could connect without encryption. (Back then I had my whole hard disk autoindexed, which is as terrible an idea as it sounds. It’s slightly less dumb with encryption and authentication though…) That said, it’s occasionally amusing to look at my nginx access logs and see people trying to hit port 80 only to get blocked by auth. Also, broken websites that pointed shit to localhost (mostly due to borked DNS) produced HTTP authentication dialogue boxes.

The funny thing is, the last time I reinstalled Linux was 2012, which also means the last time I copied the config from another install was that long ago too. I only just found out I somehow never copied the .htpasswd over and authentication was entirely impossible. I guess it didn’t really make a difference since I never gave the password out and I hardly used it myself.

Ack

Man, having to take English modules even in university sucks big-time.

I knew this was coming, and I still can’t take it. It’s really unfortunate that it just so happens that I have a Wednesday night cartoon to help out in fansubbing and the English classes are on Monday and Thursday mornings.

Writing is something I’m not efficient at, and my habit of doing things at the last minute hasn’t gone away, so now I’m pondering over the various amounts of doom I’ll experience tomorrow if I don’t manage to get shit done within about two hours, er, 1.5 hours. (See, that’s how long it took to write this blog post.)

Holy shit, I am doomed.

Last semester I only really got into trouble with the last PC1141 lab summary, which I rushed out the morning of the day it was due. (I somehow got an A for that, but I think that was just sympathy marks, lol. I put considerably more effort into the earlier lab summaries and only got B/C for those.) Homework for the other modules I was taking wasn’t too hard: Multivariable Calculus was easy, Fundamental Concepts of Mathematics was challenging but doable, Physics I was also challenging but I got help. I guess it worked in my favour that I took only three modules that required doing work outside of the lectures last semester.

But this semester I’m taking six modules, five of which have homework of some sort. And Mathematical Analysis I has weekly assignments that take over six hours to finish. This is absolutely ridiculous, I tell you. (Hello, 5 MC should be a weekly workload of 12.5 hours, not— actually I guess five hours of lectures/tutorials and six hours of doing homework makes only 11 hours. But still!)

It’s good practice, sure, which actually reminds me of the fact that while I have a fairly solid understanding of the “easy” maths, I haven’t really explored much of higher mathematics, so the laidback attitude I’ve had throughout high school and last semester will probably betray me at some point. (In fact, it already did: I messed up the Multivariable Calculus exam. But that’s a story for another day. I’d also say I botched the Physics I exam but seeing how I got an A overall, other people must’ve done even worse. Thanks, bell curve.)

Oops

Given any polynomial P, P(P(x))-x is always divisible by P(x)-x.

Proof: Factor P(x)-x into linear factors over its zeroes. These values are the fixed points of P, and hence also of P\circ P, so P(P(x))-x vanishes at all of them.

So, uh, I realise five years later that this actually doesn’t really hold water. It works when there are no repeated roots, of course, but what if there are? Then this just means that P(P(x))-x has the same roots as P(x)-x, not that the former has root multiplicity at least as large as the latter (which is what we need to prove).

The proof is easy to mend anyway. Suppose x_0 is some root of P(x)-x with multiplicity m (where m\ge1). We can express P(x)-x as a polynomial in terms of x-x_0; the leading nonzero term (with terms in order of ascending exponent) is thus a_m(x-x_0)^m. To simplify the equations a bit, let y:=x-x_0. Evaluating P(P(x))-x then gives

\displaystyle \begin{array}{lcl}P(P(x))-x&=&\left[P(P(x))-P(x)\right]+\left[P(x)-x\right]\\  &=&a_m(y+a_my^m+O(y^{m+1}))^m+O((y+a_my^m+O(y^{m+1}))^{m+1})+a_my^m+O(y^{m+1})\\  &=&a_m(y^m+my^{m-1}a_my^m+O(y^{2m}))+O(y^{m+1})+a_my^m+O(y^{m+1})\\  &=&2a_my^m+O(y^{2m-1})+O(y^{m+1})\\  &=&2a_my^m+O(y^{\min\{m+1,2m-1\}})  \end{array}

Therefore P(P(x))-x has x_0 as a root with multiplicity at least m. In particular, if m\ge2, then the multiplicity is exactly m.
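A quick concrete check of the expansion (an example of my own): take P(x)=x+x^2, so that x_0=0, m=2 and a_2=1. Then

\displaystyle P(P(x))-x=(x+x^2)+(x+x^2)^2-x=2x^2+2x^3+x^4,

which has 0 as a root of multiplicity exactly 2, with the predicted leading coefficient 2a_2=2.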

Easy mode: prove that (P(x)-x)|(P^n(x)-x) for all nonnegative integers n, where the exponent refers to function iteration.

Hard mode: do that without induction. (The above argument adapts very straightforwardly to the general case, but also requires induction on n.)