What is a tensor?


Tensors are mathematical objects used throughout physics, maths and engineering. I first came across them while trying to understand the General Theory of Relativity. The descriptions online are usually quite abstract and use lots of symbols, requiring a mathematical background to even begin. They say things like: "a tensor is a geometrical object that describes linear relationships between...". I found these baffling at first, so I tried to come up with a concrete example.

This example is very contrived. There is approximately zero chance that anybody in the history of humanity will ever actually find themselves in the situation I'm about to describe, and then use tensors to resolve it. But it will hopefully be easier to grasp than descriptions like the one above.

Please note: you'll need a modern HTML5 browser to see the graphs on this page, which are drawn using SVG. You'll also need JavaScript enabled to see the equations, which are written in TeX and rendered using MathJax and may take a few seconds to appear. This page also has some graphs created using Octave, an open-source mathematical programming package.


Imagine that you run a singles club. Your marketing skills leave a lot to be desired and after three months you have recruited 1 female and 2 males. You've sent them on a lot of fun activities, and they are absolutely, definitively not interested in each other. The Alpha Singles Club is not going well. You contact a friend who lives nearby who runs the Beta Singles Club. She is doing slightly better, with 2 females and 3 males.

Because you are mathematically inclined, you decide to graph these two sets of numbers: (1,2) and (2,3):
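[Graph: the Alpha SC vector (1, 2), in blue, and the Beta SC vector (2, 3) drawn as arrows from the origin; horizontal axis: females, vertical axis: males]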
Those arrows are called vectors, which are also one dimensional tensors.


Now pretend that the Alpha and Beta Singles Clubs decide to do a combined outing. You are going to send them out to sea in double kayaks. Each kayak will have an Alpha SC member and a Beta SC member. You're hoping that out there on the open water, someone will fall in love and leave a good review on your website.

Out of curiosity, you decide to calculate all the possible couple combinations between the two clubs. For example there are 6 male/male possibilities, the 2 Alpha men * the 3 Beta men. Similarly your 1 Alpha female could canoe with either of the 2 Beta women, so there are 2 female/female possibilities. There are also 3 female/male and 4 male/female combos. You put these numbers into a box:

\(\begin{bmatrix}2 & 3\\4 & 6\end{bmatrix}\)

This 2x2 box written in big square brackets is a matrix. It is also a two dimensional tensor.

It is time to introduce some mathematical notation. In their natural state vectors are written vertically. They are essentially a simple type of matrix so they are also written with big square brackets. The notation \(a^1\) means the first element of vector \(a\). The vector for the Alpha Singles Club looks like:

vector \(a = \begin{bmatrix}a^1\\a^2\end{bmatrix} = \begin{bmatrix}1\\2\end{bmatrix}\)

And for the Beta SC:

vector \(b = \begin{bmatrix}b^1\\b^2\end{bmatrix} = \begin{bmatrix}2\\3\end{bmatrix}\)

To get the matrix of kayaking combinations above, you have to compute the vector direct product or dyadic of the vectors \(a\) and \(b\). This is represented by a great little symbol that looks like it belongs at a railroad crossing \(\otimes\). It is found by multiplying all the numbers in the first vector by all the numbers in the second and writing the result in a box:

\(a \otimes b = \begin{bmatrix}a^1\\a^2\end{bmatrix} \otimes \begin{bmatrix}b^1\\b^2\end{bmatrix} = \begin{bmatrix}a^1b^1 & a^1b^2\\a^2b^1 & a^2b^2\end{bmatrix} = \begin{bmatrix}1*2 & 1*3\\2*2 & 2*3\end{bmatrix} = \begin{bmatrix}2 & 3\\4 & 6\end{bmatrix}\)
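If you'd like to check this on a computer, here is a minimal Python sketch of the direct product. The function name `outer` and the use of plain nested lists are my own choices for illustration, not part of any library:

```python
def outer(u, v):
    """Direct (outer) product: every element of u times every element of v."""
    return [[x * y for y in v] for x in u]

a = [1, 2]  # Alpha SC: 1 female, 2 males
b = [2, 3]  # Beta SC: 2 females, 3 males

k = outer(a, b)
for row in k:
    print(row)
# [2, 3]
# [4, 6]
```

Each row of the result belongs to one element of \(a\), and each column to one element of \(b\), which is exactly the layout of the box above.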

Can you see how much easier it is to run a singles club if you are familiar with vector multiplication?


Suspend your disbelief again, because three person kayaks are quite uncommon. Imagine that another singles club comes on the scene just before the expedition, the Ceta Singles Club. Ceta SC is a much bigger concern, but they are impressed by your grasp of vectors and they want in. They have 9 women and 7 men on their mailing list. They ask you to come up with all the possible kayaking gender combinations for a joint Alpha, Beta and Ceta outing.

You first produce their vector:

vector \(c = \begin{bmatrix}c^1\\c^2\end{bmatrix} = \begin{bmatrix}9\\7\end{bmatrix}\)

Now you need to compute \((a \otimes b) \otimes c\), which will involve multiplying every number in the 2x2 matrix above by both numbers in the new vector \(c\). It will produce a 2x2x2 cube of numbers, which I have tried my best to represent in HTML. Imagine the grey numbers form the back half of the cube:

\((a \otimes b) \otimes c = \begin{bmatrix}a^1b^1 & a^1b^2\\a^2b^1 & a^2b^2\end{bmatrix} \otimes \begin{bmatrix}c^1\\c^2\end{bmatrix} = \) \( \begin{bmatrix}a^1b^1c^1 & a^1b^2c^1\\a^2b^1c^1 & a^2b^2c^1\end{bmatrix} _\class{matrix3d}{\begin{bmatrix}a^1b^1c^2 & a^1b^2c^2\\a^2b^1c^2 & a^2b^2c^2\end{bmatrix}} =\) \( \begin{bmatrix}1*2*9 & 1*3*9\\2*2*9 & 2*3*9\end{bmatrix} _\class{matrix3d}{\begin{bmatrix}1*2*7 & 1*3*7\\2*2*7 & 2*3*7 \end{bmatrix}} = \begin{bmatrix}18 & 27\\36 &54\end{bmatrix} _\class{matrix3d}{\begin{bmatrix}14 & 21\\28 & 42\end{bmatrix}} \)

So that is 2x2x2 = 8 numbers representing every gender combination. For instance \(a^1b^2c^1\) is the 27 possible combinations of the 1 Alpha female, the 3 Beta males and the 9 Ceta females. That cube is a three dimensional tensor.
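The cube can be sketched in Python as a list of lists of lists. This is only an illustration under my own conventions: the names `cube`, `front` and `back` are made up, and Python counts indices from 0, so \(a^1\) is `a[0]`:

```python
a = [1, 2]  # Alpha SC (females, males)
b = [2, 3]  # Beta SC
c = [9, 7]  # Ceta SC

# cube[i][j][m] = a[i] * b[j] * c[m], using zero-based indices
cube = [[[x * y * z for z in c] for y in b] for x in a]

# Slice the cube into the two 2x2 layers shown above
front = [[cube[i][j][0] for j in range(2)] for i in range(2)]  # multiplied by c^1
back = [[cube[i][j][1] for j in range(2)] for i in range(2)]   # multiplied by c^2

print(front)  # [[18, 27], [36, 54]]
print(back)   # [[14, 21], [28, 42]]
```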

If the Delta Singles Club suggested a 4-person paddle boating afternoon, then you could make a four dimensional tensor which would be a 2x2x2x2 hypercube with 16 numbers. It's very hard to visualise a four dimensional hypercube, but you can imagine it as 2 normal cubes floating around in space. The first cube has all of the above multiplied by \(d^1\) and the second by \(d^2\). Similarly if the Epsilon SC ever showed up, the five dimensional tensor would be a 2x2x2x2x2 hypercube with 32 numbers. But that is getting a bit ridiculous because I don't think there are many 5-person water based activities.

Vector Transformations

It would be nice to think that you now know all about tensors. Unfortunately, there's still a long way to go. What they are (hypercubes) is one thing. It's what they do that really matters. And what they do is to enable transformations of co-ordinate systems. If that sounds intimidating, please read on.

Oars'R'Us sell kayaking paraphernalia. They mainly focus on oars, but they also do a nice line in waterproof shoes and snacks. They want to supply these shoes for your expedition, and they will make it worth your while. They ask for all your data on memberships and gender combinations. However, they want it in their own format. They only care about shoes and snacks. They will provide two shoes per person, and (in a sexist cost cutting exercise based on European Recommended Daily Allowances) they offer 3 snacks for each woman and 4 for each man.

In order to provide this information, you'll need to convert/transform your data into their format. Your vectors are \(a\) and \(b\). You'll transform them into \(\tilde{a}\) and \(\tilde{b}\):

\(\tilde{a}^1\) = two shoes per female + two shoes per male = \(2 * a^1 + 2 * a^2 \) = 6 shoes

\(\tilde{a}^2\) = three snacks per female + four snacks per male = \(3 * a^1 + 4 * a^2 \) = 11 snacks

To calculate this more easily, you can put those shoe and snack transformation numbers into a matrix of shoes and snacks:

\(T = \begin{bmatrix}{T^1}_1 & {T^1}_2 \\ {T^2}_1 & {T^2}_2\end{bmatrix} = \begin{bmatrix}2\ shoes\ per\ female & 2\ shoes\ per\ male \\3\ snacks\ per\ female & 4\ snacks\ per\ male \end{bmatrix} = \begin{bmatrix}2 & 2 \\3 & 4 \end{bmatrix}\)

Within T, the first upper number is the row, and the second lower number is the column. Then you can use matrix multiplication to get the answers. When multiplying matrices, you multiply and add each row in the first matrix by each column in the second matrix. So in the multiplication below, you have to multiply the top row of T which is \(\begin{bmatrix}{T^1}_1\ {T^1}_2\end{bmatrix}\) by the values of \(b = \begin{bmatrix}b^1\ b^2\end{bmatrix}\). This is done by multiplying the two values separately and then adding them together like this:

\(\begin{bmatrix}\tilde{b}^1\\\tilde{b}^2\end{bmatrix} = T \begin{bmatrix}b^1\\b^2\end{bmatrix} = \begin{bmatrix}{T^1}_1 & {T^1}_2 \\ {T^2}_1 & {T^2}_2\end{bmatrix} \begin{bmatrix}b^1\\b^2\end{bmatrix} = \begin{bmatrix}({T^1}_1 * b^1) + ({T^1}_2 * b^2) \\ ({T^2}_1 * b^1) + ({T^2}_2 * b^2)\end{bmatrix} = \begin{bmatrix}2*2+2*3\\3*2+4*3\end{bmatrix} = \begin{bmatrix}10\\18\end{bmatrix}\)

Notice how the lower index of T sort of cancels out (through multiplying) with the upper index of b. Why some numbers are superscripts and others subscripts will be explained soon.
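If you'd rather let a computer do the multiplying and adding, here is a minimal Python sketch of the same calculation. The helper name `mat_vec` is something I made up for illustration; it is not from any library:

```python
def mat_vec(M, v):
    """Multiply each row of M by the column vector v and add up the products."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

T = [[2, 2],   # shoes per female, shoes per male
     [3, 4]]   # snacks per female, snacks per male

a = [1, 2]  # Alpha SC (females, males)
b = [2, 3]  # Beta SC (females, males)

a_tilde = mat_vec(T, a)
b_tilde = mat_vec(T, b)
print(a_tilde)  # [6, 11]
print(b_tilde)  # [10, 18]
```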

There are much friendlier, more graphical descriptions of matrix multiplication available. Take some time to get familiar with that process, because it's about to get a lot more involved.

Tensor Transformations

Remember that lovely quiet evening when you computed all the combinations of gender kayaking pairs between Alpha SC and Beta SC, \((a \otimes b)\)? Let's label that tensor \(k^{ab}\). The amazing and wonderful thing about tensors is that you can compute \(\tilde{k}^{ab}\) directly from \(k^{ab}\)! The secret is to multiply the tensor \(k^{ab}\) times T twice. T is called the inverse transformation matrix. Why inverse? That will be revealed in due course. In other not-quite-words:

\(\tilde{k}^{ab} = \begin{bmatrix}\tilde{k}^{11} & \tilde{k}^{12}\\\tilde{k}^{21} & \tilde{k}^{22}\end{bmatrix} = T T k^{ab} = \begin{bmatrix}{T^1}_1 & {T^1}_2\\{T^2}_1 & {T^2}_2\end{bmatrix} \begin{bmatrix}{T^1}_1 & {T^1}_2\\{T^2}_1 & {T^2}_2\end{bmatrix} \begin{bmatrix}k^{11} & k^{12}\\k^{21} & k^{22}\end{bmatrix}\)

Computing this is an extension of matrix multiplication. To compute the element in the first row and first column of \(\tilde{k}^{ab}\), you have to multiply and add the first row in the first T, times the first row in the second T, times every element in \(k^{ab}\).

\(\tilde{k}^{11} = ({T^1}_1 * {T^1}_1 * k^{11}) + ({T^1}_1 * {T^1}_2 * k^{12}) + ({T^1}_2 * {T^1}_1 * k^{21}) + ({T^1}_2 * {T^1}_2 * k^{22})\)

Again notice how some of the ups and downs in T and k sort of cancel each other out. Using the real numbers from the actual kayaking pairs:

\(\tilde{k}^{11} = (2 * 2 * 2) + (2 * 2 * 3) + (2 * 2 * 4) + (2 * 2 * 6) = 60\)

You can double check this by computing \(\tilde{a} \otimes \tilde{b}\) directly:

\(\tilde{k}^{ab} = \tilde{a} \otimes \tilde{b} = \begin{bmatrix}\tilde{a}^1 \\\tilde{a}^2\end{bmatrix} \otimes \begin{bmatrix}\tilde{b}^1 \\\tilde{b}^2\end{bmatrix} = \begin{bmatrix}6\\11\end{bmatrix} \otimes \begin{bmatrix}10\\18\end{bmatrix} = \begin{bmatrix}60 & 108 \\ 110 & 198\end{bmatrix}\)
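The same double check can be sketched in Python. This assumes the numbers used so far and zero-based indices, so \(\tilde{k}^{11}\) is `k_tilde[0][0]`; the variable names are my own:

```python
T = [[2, 2],
     [3, 4]]
k = [[2, 3],
     [4, 6]]  # kayaking pairs, the direct product of a and b

# k_tilde[a][b] = sum over m and n of T[a][m] * T[b][n] * k[m][n]
k_tilde = [[sum(T[a][m] * T[b][n] * k[m][n]
                for m in range(2) for n in range(2))
            for b in range(2)]
           for a in range(2)]

# The direct route: transform the vectors first, then take their direct product
a_tilde = [6, 11]
b_tilde = [10, 18]
direct = [[x * y for y in b_tilde] for x in a_tilde]

print(k_tilde)  # [[60, 108], [110, 198]]
print(direct)   # [[60, 108], [110, 198]]
```

Both routes give the same four numbers, which is the whole point of the transformation rule.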

This works for any number of dimensions. For three dimensional tensors:

\(\tilde{k}^{abc} = T T T k^{abc}\)

And looking at just one of the eight numbers in \(\tilde{k}^{abc}\):

\(\tilde{k}^{111} = ({T^1}_1 * {T^1}_1 * {T^1}_1 * k^{111}) + ({T^1}_1 * {T^1}_1 * {T^1}_2 * k^{112}) + \) six other terms

Summation Notation

Most textbook descriptions of tensors never mention a real actual number like 60. And they quickly dispense even with vectors and matrices in square brackets. They use summation notation with the Greek letter S for Sum. In Greek it's called Sigma and looks like \(\sum\).

For example the sum above \(\tilde{b}^1 = {T^1}_1 * b^1 + {T^1}_2 * b^2\) can be written as:

\(\tilde{b}^1 = \sum_{m=1}^2 {T^1}_m b^m\)

Which says to compute \({T^1}_m b^m\) for m=1 and m=2 and sum/add the result. In this sum, the \(m\) is the index of summation.

There would be a similar sum for \(\tilde{b}^2\) but to avoid having to write it, you can replace the number 1 with a free variable, in this case \(a\). So this:

\(\tilde{b}^a = \sum_{m=1}^2 {T^a}_m b^m\)

Is (very) short for this:

\(\begin{bmatrix}\tilde{b}^1\\\tilde{b}^2\end{bmatrix} = \begin{bmatrix}{T^1}_1 & {T^1}_2\\{T^2}_1 & {T^2}_2\end{bmatrix} \begin{bmatrix}b^1\\b^2\end{bmatrix} \)

And for two dimensional tensors:

\(\tilde{k}^{ab} = \sum_{m=1}^2 \sum_{n=1}^2 {T^a}_m {T^b}_n k^{mn}\)

There are three types of letter used in these sums. \(k\) represents our tensor. \(a\) and \(b\) are the free variables. They can take on the numbers 1 or 2 and enable us to essentially write 4 different sums (for each combination of \(a\) and \(b\)) in one line. \(m\) and \(n\) are summation indices. They must be evaluated for the numbers 1 and 2 with the results added together.

In three dimensions:

\(\tilde{k}^{abc} = \sum_{m=1}^2 \sum_{n=1}^2 \sum_{o=1}^2 {T^a}_m {T^b}_n {T^c}_o k^{mno}\)
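A triple sum like this maps quite naturally onto nested loops. Here is a rough Python sketch, assuming the Alpha, Beta and Ceta vectors from earlier. The function name `transform3` is hypothetical, and the indices run over 0 and 1 rather than 1 and 2:

```python
from itertools import product

T = [[2, 2], [3, 4]]
a, b, c = [1, 2], [2, 3], [9, 7]

# The 2x2x2 cube k^{mno} = a^m b^n c^o
k = [[[x * y * z for z in c] for y in b] for x in a]

def transform3(T, k):
    """k_tilde[a][b][c] = sum over m, n, o of T[a][m] * T[b][n] * T[c][o] * k[m][n][o]."""
    size = range(len(T))
    out = [[[0 for _ in size] for _ in size] for _ in size]
    for i, j, l in product(size, repeat=3):        # free variables
        for m, n, o in product(size, repeat=3):    # summation indices
            out[i][j][l] += T[i][m] * T[j][n] * T[l][o] * k[m][n][o]
    return out

k_tilde = transform3(T, k)
print(k_tilde[0][0][0])  # 1920
```

The outer loop walks over the free variables (one pass per element of the result) and the inner loop does the summing. The 1920 agrees with transforming the three vectors separately: 6 shoes * 10 shoes * 32 shoes.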

In all our examples the free variables and summation indices only go up to 2 (females and males, shoes and snacks). But in the General Theory of Relativity, all of these would go up to 4 (the 3 spatial dimensions x, y, z and time). Which means that a 3-dimensional tensor has 4x4x4 elements.

Einstein's Summation Notation

Albert Einstein himself introduced a further abbreviation to the sums above. He said that whenever a summation index is repeated twice on the right hand side of an equation, we can drop the \(\sum\) sign and just assume it is there.

In this expression the \(m\) and \(n\) are each repeated twice on the right:

\(\tilde{k}^{ab} = \sum_{m=1}^2 \sum_{n=1}^2 {T^a}_m {T^b}_n k^{mn}\)

Therefore this can be abbreviated to just:

\(\tilde{k}^{ab} = {T^a}_m {T^b}_n k^{mn}\)

For seven dimensional tensors you can represent a tensor transformation of 2x2x2x2x2x2x2=128 elements with just:

\(\tilde{k}^{abcdefg} = {T^a}_m {T^b}_n {T^c}_o {T^d}_p {T^e}_q {T^f}_r {T^g}_s k^{mnopqrs}\)

In the space and time of general relativity, a seven dimensional transformation like that would actually have 4x4x4x4x4x4x4=16,384 elements.
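To show that the pattern really is the same for any number of dimensions, here is a sketch of a general contravariant transformation in Python. The storage format is my own assumption for illustration: the tensor is a dictionary from index tuples such as `(0, 1)` to numbers, and `transform` is a made-up name:

```python
from itertools import product
from math import prod

def transform(T, k, rank):
    """Contravariant transformation of a tensor with `rank` upper indices.

    k maps zero-based index tuples to numbers. For each tuple of free
    indices, sum over every tuple of summation indices, multiplying one
    T element per index by the matching element of k.
    """
    dim = len(T)
    index_tuples = list(product(range(dim), repeat=rank))
    return {
        free: sum(prod(T[f][s] for f, s in zip(free, summed)) * k[summed]
                  for summed in index_tuples)
        for free in index_tuples
    }

T = [[2, 2], [3, 4]]
a, b = [1, 2], [2, 3]
k = {(m, n): a[m] * b[n] for m in range(2) for n in range(2)}

k_tilde = transform(T, k, rank=2)
print(k_tilde[(0, 0)], k_tilde[(0, 1)], k_tilde[(1, 0)], k_tilde[(1, 1)])
# 60 108 110 198

# A seven dimensional tensor filled with 1s, just to count the elements
ones = {idx: 1 for idx in product(range(2), repeat=7)}
print(len(transform(T, ones, rank=7)))  # 128
```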

Independence For All Vectors

Do you remember that T is called the inverse transformation matrix? To find out what T is the inverse of, we first have to discuss vector independence, an issue which affects all vectors everywhere.

An important property of vectors and tensors is that they are independent of any coordinate system. The blue vector in the graph at the beginning represents the membership of Alpha Singles Club. For convenience, we assigned that independent vector the coordinates (1, 2) for (1 female, 2 males). Oars'R'Us assigned the same vector the coordinates (6, 11) for (6 shoes, 11 snacks). But it's the same vector! It is still the Alpha SC's membership.

Admittedly this is a stretch of the imagination where Alpha Singles Club is concerned, but it makes more sense when you consider the theory of relativity. A vector pointing from the Earth towards the Sun is an arrow in space. Whether we measure it in miles, millimeters or light years, it is still the same vector. And if space and time are bent by the massive gravity of the Sun and the Earth, causing the calculations to go wild as we get too close, forcing us to change coordinate systems to avoid being swallowed by an enormous gravitational pit, it is still the same vector. Alpha SC is a bit simpler than the Solar System, but the same principle applies.

For a visual example, the graph at the beginning counts the number of females on the horizontal axis and males on the vertical axis. We can draw two more vectors onto it. The black vectors form the basis of this graph. The horizontal basis is a vector which represents 1 female and 0 males. The vertical basis is a vector which represents 0 females and 1 male:
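[Graph: the membership vectors with the two black basis vectors added; horizontal axis: females, vertical axis: males]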

What happens when we use a different basis? The next graph has a blue vector with coordinates (2, 4) for (2 female hands, 4 male feet). This vector still represents Alpha SC's membership. We can plot this change of coordinates in two ways - by making the vector twice as long, or making the coordinate grid twice as small. We'll do the latter, as it's more in tune with the universe. If we were thinking about the Solar System, a change from kilometres to metres would not make the Sun 1000 times further away. Likewise, whether we graph Alpha SC according to the number of people or extremities makes no difference to the actual membership. The basis of this graph is (1 female hand, 1 male foot). The bases look small only because the whole coordinate system has shrunk around the consistently sized vectors:
[Graph: the same vectors on the shrunken grid; horizontal axis: female hands, vertical axis: male feet]

Basis vectors do not have to be the same length or even perpendicular. The following graph took me ages and several sheets of paper to work out. It shows the membership vectors in the Oars'R'Us coordinate system. The basis of this graph is (1 shoe, 1 snack). The blue and red vectors look fairly small in comparison to the basis vectors, but if you zoom in and count, you'll find that the blue Alpha SC vector's arrowhead sits at (6 shoes, 11 snacks) as calculated above. The whole coordinate system has been compressed like an accordion, so that vectors pointing diagonally up/right cover a lot of ground very quickly. Please admire this graph, a lot of scribbling went into it:

Note that the only reason we are able to graph these vectors so nicely is because our transformation also happens to work as a coordinate transformation. This is a confusing coincidence and will be unravelled in the next article on ordinary tensor differentiation.

Transforming Bases

For all of the graphs above, basis vectors are written as rows (rather than columns like normal vectors). Textbooks often use the letter \(i\) for the horizontal basis and \(j\) for the vertical one. For the first graph above:

horizontal basis = 1 female, 0 males = \(i = \begin{bmatrix}i_1 & i_2 \end{bmatrix} = \begin{bmatrix}1 & 0 \end{bmatrix}\)

vertical basis = 0 females, 1 male = \(j = \begin{bmatrix}j_1 & j_2 \end{bmatrix} = \begin{bmatrix}0 & 1 \end{bmatrix}\)

You can then think of the Alpha SC and Beta SC vectors as combinations of the basis vectors. Beta SC's vector is made by adding together 2 of the female basis plus 3 of the male basis:

\(b = \begin{bmatrix}1 & 0 \end{bmatrix} * 2 + \begin{bmatrix}0 & 1 \end{bmatrix} * 3\) = 2 females and 3 males = \(\begin{bmatrix}2 \\ 3 \end{bmatrix}\)

This can be better portrayed with a matrix multiplication:

\(b = \begin{bmatrix}1 & 0 \\ 0 & 1 \end{bmatrix} \begin{bmatrix}2 \\ 3 \end{bmatrix} = \begin{bmatrix}(1*2) + (0*3) \\ (0*2) + (1*3) \end{bmatrix} = \begin{bmatrix}2 \\ 3 \end{bmatrix}\)

The matrix on the left with 1s and 0s is the matrix equivalent of the number 1. It's called the identity matrix or I for short.

So how did I figure out how to draw the bases for the other graphs above? The second one was easy: I just made the bases half the length, \(\begin{bmatrix}\frac12 & 0 \end{bmatrix}\) and \(\begin{bmatrix}0 & \frac12 \end{bmatrix}\). Meanwhile the Beta SC vector coordinates (4 female hands, 6 male feet) had doubled, so the multiplication still held:

\(b = \begin{bmatrix}\frac12 & 0 \\ 0 & \frac12 \end{bmatrix} \begin{bmatrix}4 \\ 6 \end{bmatrix} = \begin{bmatrix}2 \\ 3 \end{bmatrix}\)

The Oars'R'Us coordinate system was harder. It involved doing some algebra. You worked out above that the number of shoes was equal to 2 * females + 2 * males. You can also work out the reverse. You'll find that the number of females equals 2 * shoes - 1 * snacks. Similarly the number of males = -1.5 * shoes + 1 * snacks. Those are the bases of the third graph above:

\(\tilde{i} = \begin{bmatrix}2 & -1 \end{bmatrix}\)

\(\tilde{j} = \begin{bmatrix}-1.5 & 1 \end{bmatrix}\)

And using matrix multiplication:

\(b = \begin{bmatrix}\tilde{i}_1 & \tilde{i}_2 \\ \tilde{j}_1 & \tilde{j}_2 \end{bmatrix} \tilde{b} = \begin{bmatrix}2 & -1 \\ -1.5 & 1 \end{bmatrix} \begin{bmatrix}10 \\ 18 \end{bmatrix} = \begin{bmatrix}(2 * 10 + -1 * 18) \\ (-1.5 * 10 + 1 * 18) \end{bmatrix} = \begin{bmatrix}2 \\ 3 \end{bmatrix}\)

The matrix at the beginning, the one with the two bases on top of each other, is the transformation matrix, usually labelled S. It is the thing that T is the inverse of. It can also describe how the horizontal base vectors get transformed (which isn't surprising as S is constructed using those base vectors). In this case the base vector comes first and its single row is multiplied and added to each column of the transformation matrix:

\(\tilde{i} = i S = \begin{bmatrix} 1 & 0 \end{bmatrix} \begin{bmatrix} 2 & -1 \\ -1.5 & 1 \end{bmatrix} = \begin{bmatrix} (1*2) + (0*-1.5) & (1*-1) + (0*1) \end{bmatrix} = \begin{bmatrix}2 & -1 \end{bmatrix}\)

We can now happily write the following three equations involving S and T:

\(\tilde{i} = i S\)

\(b = S \tilde{b}\)

\(\tilde{b} = T b\)

S is the transformation matrix. It stores how the bases change between coordinate systems and how to convert \(\tilde{b}\) back into \(b\). T is the inverse transformation matrix. It stores how the coordinates of the vectors change between coordinate systems and how to transform \(b\) into \(\tilde{b}\). S and T are the inverse of each other, which means that when you multiply them together:

\(ST = \begin{bmatrix}2 & -1 \\ -1.5 & 1 \end{bmatrix} \begin{bmatrix}2 & 2 \\ 3 & 4 \end{bmatrix} = \begin{bmatrix}(2 * 2 + -1 * 3) & (2 * 2 + -1 * 4) \\ (-1.5 * 2 + 1 * 3) & (-1.5 * 2 + 1 * 4) \end{bmatrix} = \begin{bmatrix}1 & 0 \\ 0 & 1 \end{bmatrix} = I\)
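Instead of doing the algebra by hand, S can be recovered from T with the standard formula for the inverse of a 2x2 matrix. Here is a sketch in Python using exact fractions. The helper names `inverse_2x2` and `mat_mul` are mine, and the formula only works when the determinant is not zero:

```python
from fractions import Fraction

def inverse_2x2(M):
    """Inverse of a 2x2 matrix via the adjugate formula (assumes a non-zero determinant)."""
    (p, q), (r, s) = M
    det = Fraction(p * s - q * r)
    return [[s / det, -q / det],
            [-r / det, p / det]]

def mat_mul(A, B):
    """Matrix product: rows of A multiplied and added against columns of B."""
    return [[sum(A[i][m] * B[m][j] for m in range(len(B)))
             for j in range(len(B[0]))]
            for i in range(len(A))]

T = [[2, 2], [3, 4]]
S = inverse_2x2(T)

print([[float(x) for x in row] for row in S])  # [[2.0, -1.0], [-1.5, 1.0]]
print(mat_mul(S, T) == [[1, 0], [0, 1]])       # True
```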


Types of Vectors

That was dramatic! Did you know that there are two types of vectors in this world? Those that vary with the bases and those that don't. In the second graph above, the bases were halved while the vector coordinates doubled. They went in opposite directions.

Vectors that transform in the same way as bases are called covariant vectors. Vectors that go the other way are contravariant vectors. The same applies to tensors. So far, all our kayaking vectors and tensors have been contravariant. All the basis vectors were covariant.

As a simple example, the distance from London to Paris is about 280 miles. If we use kilometers to measure it instead, the distance jumps to 450km. So distance contra-varies with a change in basis. As the basis gets smaller, the numerical distance gets larger. The trip takes about 2.5 hours by train, which is roughly 0.01 hours per mile * 280 miles, or roughly 0.006 hours per km * 450km. So the factor to calculate the time drops from 0.01 to 0.006: it co-varies with the change in basis. As the basis gets smaller, the time factor also gets smaller.

Contravariant vector and tensor parts are written with superscripts like \(a^1\) and \(b^2\) and \(k^{12}\). They get multiplied by T which is written as \({T^1}_2\) with the superscript of the vector multiplying/adding with the subscript of T. Covariants are written with a subscript like \(i_1\) and \(j_2\) and get multiplied by S which also looks like \({S^1}_2\) but now the subscripts of the vector are multiplied/added by the superscript of S.

Linear Functions

Other than bases, it's difficult to imagine any other covariant vectors. Any Singles Club you can think of, even the global empire of the Omega Singles Club with 57212 females and 52092 males, is still a contravariant vector. But they are out there, and they are similar to the London-Paris example. This section will catch one and mercilessly describe it.

Imagine that you want to ascertain the total number of people in each Singles Club. You could just add the females plus males in your head, but that would be vaguely disappointing after all you've gone through above. So instead you decide to multiply by a vector. You spend several minutes racking your brain and then it suddenly appears in your mind's eye. A horizontal vector, which you affectionately call \(f\). And then you compute the vector inner product or vector dot product:

\(f \cdot a = \begin{bmatrix}1 & 1 \end{bmatrix} \begin{bmatrix}1 \\ 2 \end{bmatrix} = \begin{bmatrix} (1*1) + (1*2) \end{bmatrix} = \begin{bmatrix} 3 \end{bmatrix} = 3\)

The dot product of two vectors is a mini matrix multiplication which produces a number, also known as a scalar and even a zero dimensional tensor. Your \(f\) vector is a linear function of the vector \(a\). It adds up the components of \(a\) and tells you how many people are in each Singles Club.

You can make any vector into a function. If The Singles Club Society decided to award you 10 points for every female and 4 for every male, you could create the vector \(\begin{bmatrix} 10 & 4\end{bmatrix}\). We could say that all the Singles Club vectors like \(a\) and \(b\) exist in the Singles Club vector space. Then all these horizontal function vectors exist in the dual space of the Singles Club vector space. This is like a mysterious alternate reality where functions wander around looking for vectors to operate on.

And best of all these functions are covariant vectors. You can transform them by multiplying by S, in the same way that the horizontal base vectors were multiplied by S. To convert \(f\) into the Oars'R'Us shoe/snack coordinate system:

\(\tilde{f} = f S = \begin{bmatrix} 1 & 1 \end{bmatrix} \begin{bmatrix} 2 & -1 \\ -1.5 & 1 \end{bmatrix} = \begin{bmatrix} (1*2) + (1*-1.5) & (1*-1) + (1*1) \end{bmatrix} = \begin{bmatrix}0.5 & 0 \end{bmatrix}\)

\(\tilde{f}\) is a vector in the Oars'R'Us dual space. And now watch what happens when we apply the transformed function to an Oars'R'Us vector:

\(\tilde{f} \cdot \tilde{a} = \begin{bmatrix}0.5 & 0 \end{bmatrix} \begin{bmatrix}6 \\ 11 \end{bmatrix} = \begin{bmatrix} (0.5*6) + (0*11) \end{bmatrix} = 3\)

The function still works! And it makes sense logically too. Given an Oars'R'Us vector of shoes and snacks, to find the number of people we just need to take half the number of shoes, which is exactly what \(\begin{bmatrix}0.5 & 0 \end{bmatrix}\) does.
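Here is the same experiment as a small Python sketch, run for both Alpha SC and Beta SC. The helper names `row_times_matrix` and `dot` are made up for this illustration:

```python
def row_times_matrix(f, M):
    """Row vector times matrix: the row is multiplied and added against each column of M."""
    return [sum(f[m] * M[m][j] for m in range(len(f))) for j in range(len(M[0]))]

def dot(f, v):
    """Inner product of a row vector and a column vector."""
    return sum(x * y for x, y in zip(f, v))

S = [[2, -1], [-1.5, 1]]
T = [[2, 2], [3, 4]]
f = [1, 1]  # counts people: 1 per female plus 1 per male

f_tilde = row_times_matrix(f, S)
print(f_tilde)  # [0.5, 0]

for club in ([1, 2], [2, 3]):  # Alpha SC, Beta SC
    # Transform the membership vector into shoes and snacks using T
    club_tilde = [sum(T[i][m] * club[m] for m in range(2)) for i in range(2)]
    print(dot(f, club), dot(f_tilde, club_tilde))
# 3 3.0
# 5 5.0
```

The function is transformed with S and the vectors with T, and the head count comes out the same in both coordinate systems.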


Thanks to Albert Einstein's own The Meaning of Relativity book for this section (added 31/12/2015). He made me realise that a "linear function" is really just a line. For example, the equation \(y = 2x + 1\) can be written in covariant vector/tensor form as:

\(\begin{bmatrix} -2 & 1 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} = 1\)

Similarly a circle can be expressed as a two dimensional tensor:

\(\begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} = 1\)

If you multiply this out, you get:

\(x^2 + y^2 = 1\)

This tensor has two covariant dimensions. For example, to convert from the female/male system to the hands/feet system, we would have to multiply by S twice. This would leave us with:

\(\begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} 0.5 & 0 \\ 0 & 0.5 \end{bmatrix} \begin{bmatrix} 0.5 & 0 \\ 0 & 0.5 \end{bmatrix} = \begin{bmatrix} 0.25 & 0 \\ 0 & 0.25 \end{bmatrix} \)

Which works out to the equation:

\(0.25 x^2 + 0.25 y^2 = 1\)

Or if we multiply everything by 4:

\(x^2 + y^2 = 4\)

This is the equation of a circle with radius=2. We can plot the circle in both coordinate systems and see that it is the same, just like the vectors we plotted previously:

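[Graphs: the same circle plotted in both coordinate systems. First graph: females on the horizontal axis, males on the vertical axis. Second graph: female hands on the horizontal axis, male feet on the vertical axis.]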

In this way any circle or ellipse can be represented using a two dimensional covariant tensor. If we allow the imaginary square root of -1, then it can do hyperbolas too. This means that a two dimensional tensor can describe any (second degree) surface which has \(x^2\), \(xy\) and \(y^2\). Similarly a three dimensional tensor can represent third degree surfaces with \(x^3\), \(x^2y\) etc.

Transforming Using S

As with the kayaking pairs \(a \otimes b\), we can also create the vector direct product of two covariant vectors. The result is a two dimensional tensor which is transformed by multiplying by S twice:

\(\tilde{f} = f S S\)

You saw above that contravariant tensors are computed by multiplying and adding the rows of T by every element in the tensor. Covariant tensors multiply and add the columns of S by every element in \(f_{ab}\). So the first element, \(\tilde{f}_{11}\), is the first column in the first S, times the first column in the second S, times every element in \(f_{ab}\), all added together:

\(\tilde{f}_{11} = (f_{11} * {S^1}_1 * {S^1}_1) + (f_{12} * {S^1}_1 * {S^2}_1) + (f_{21} * {S^2}_1 * {S^1}_1) + (f_{22} * {S^2}_1 * {S^2}_1)\)

In summation notation:

\(\tilde{f}_{ab} = \sum_{m=1}^2 \sum_{n=1}^2 f_{mn} {S^m}_a {S^n}_b\)

And in Einstein's abbreviated notation:

\(\tilde{f}_{ab} = f_{mn} {S^m}_a {S^n}_b\)
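As a sanity check, the covariant rule can be sketched in Python and tried on two tensors: the circle from the previous section (with the hands/feet S), and the direct product of the people counting function with itself (with the Oars'R'Us S). The function name `transform_covariant` is my own invention:

```python
def transform_covariant(f, S):
    """f_tilde[a][b] = sum over m and n of f[m][n] * S[m][a] * S[n][b]."""
    size = range(len(S))
    return [[sum(f[m][n] * S[m][a] * S[n][b] for m in size for n in size)
             for b in size]
            for a in size]

# The circle x^2 + y^2 = 1, moved from females/males to hands/feet
circle = [[1, 0], [0, 1]]
S_half = [[0.5, 0], [0, 0.5]]
circle_tilde = transform_covariant(circle, S_half)
print(circle_tilde)  # the 0.25 matrix from the circle section

# The people counter f = [1, 1] multiplied by itself, moved to shoes/snacks
f = [1, 1]
ff = [[x * y for y in f] for x in f]
S = [[2, -1], [-1.5, 1]]
ff_tilde = transform_covariant(ff, S)
print(ff_tilde)
```

The second result equals the direct product of \(\tilde{f} = \begin{bmatrix}0.5 & 0 \end{bmatrix}\) with itself, so the only non-zero element is \(0.5 * 0.5 = 0.25\) in the top left.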

Tensor Type

However, the tensor \(f_{ab}\) is still just a 2x2 box of numbers. How can you tell by looking at it whether it is contravariant, and transforms using T, or covariant using S? The answer is that you can't. You have to know the tensor's type aka valency. This is written as two numbers in parentheses. First comes the contravariant rank and then the covariant rank.

A simple one-dimensional contravariant vector like the Alpha SC membership vector is a tensor of type (1,0). The covariant people counting function above is of type (0,1). The kayaking pairs tensor \(k^{ab}\) is a type (2,0) tensor, and the \(f_{ab}\) just described is (0,2). Furthermore, contravariants are written using superscripts and covariants with subscripts.

It is also possible to have a mixed rank tensor. Consider an expanded function which computes two things at the same time. This function will produce a vector. The first number will be the number of people. The second number will be the Singles Club Society's esoteric points system. We'll call the resulting vector \(r\). Like this:

\(r = f a = \begin{bmatrix}1 & 1 \\ 10 & 4 \end{bmatrix} \begin{bmatrix}1 \\ 2 \end{bmatrix} = \begin{bmatrix} (1*1) + (1*2) \\ (10*1) + (4*2) \end{bmatrix} = \begin{bmatrix} 3 \ people \\ 18 \ points \end{bmatrix} \)

So how is this new function f transformed into another coordinate system? Well, we know what \(r\) and the transformed \(\tilde{r}\) are, and we know something about \(a\) as well:

\(r = f a\)

\(\tilde{r} = T r \)

\(\tilde{r} = \tilde{f} \tilde{a} \)

\(a = S \tilde{a}\)

Putting this all together:

\(\tilde{r} = \tilde{f} \tilde{a} = T f a = T f S \tilde{a}\)

Which means that \(\tilde{f} = T f S\) and our new \(f\) is transformed by both T and S and is therefore a tensor of type (1,1). I found this strange at first, but I've tried it and followed through the maths and it works:

\(\tilde{r} = T r = \begin{bmatrix}2 & 2 \\ 3 & 4 \end{bmatrix} \begin{bmatrix}3 \\ 18 \end{bmatrix} = \begin{bmatrix}42 \\ 81 \end{bmatrix} \)

\(\tilde{r} = \tilde{f} \tilde{a} = T f S \tilde{a} = \begin{bmatrix}2 & 2 \\ 3 & 4 \end{bmatrix} \begin{bmatrix}1 & 1 \\ 10 & 4 \end{bmatrix} \begin{bmatrix}2 & -1 \\ -1.5 & 1 \end{bmatrix} \begin{bmatrix}6 \\ 11 \end{bmatrix} = \begin{bmatrix}29 & -12 \\ 57.5 & -24 \end{bmatrix} \begin{bmatrix}6 \\ 11 \end{bmatrix} = \begin{bmatrix}42 \\ 81 \end{bmatrix} \)
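For anyone who wants to repeat the check without the arithmetic, here is a Python sketch of both routes to \(\tilde{r}\). The helper names `mat_mul` and `mat_vec` are made up for illustration:

```python
def mat_mul(A, B):
    """Matrix product: rows of A multiplied and added against columns of B."""
    return [[sum(A[i][m] * B[m][j] for m in range(len(B)))
             for j in range(len(B[0]))]
            for i in range(len(A))]

def mat_vec(M, v):
    """Matrix times column vector."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

T = [[2, 2], [3, 4]]
S = [[2, -1], [-1.5, 1]]
f = [[1, 1],    # people: 1 per female, 1 per male
     [10, 4]]   # points: 10 per female, 4 per male
a = [1, 2]      # Alpha SC

r = mat_vec(f, a)                     # people and points
a_tilde = mat_vec(T, a)               # shoes and snacks
f_tilde = mat_mul(mat_mul(T, f), S)   # the type (1,1) transformation T f S

via_T = mat_vec(T, r)                     # transform the result directly
via_f_tilde = mat_vec(f_tilde, a_tilde)   # apply the transformed function

print(f_tilde)      # [[29.0, -12], [57.5, -24]]
print(via_T)        # [42, 81]
print(via_f_tilde)  # [42.0, 81.0]
```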

What does this result actually mean? It represents 42 shoes and 81 snacks but has nothing to do with people and points. And it doesn't need to. It only has meaning if you transform it back into the Singles Club coordinate system.

One interesting feature is that our one-dimensional people-adding function produced the same answer (3 people) in both coordinate systems. Why then and not now? Observe what happens with tensor transformations as we count down the dimensions:

3 dimensions type (3,0): \(\tilde{k}^{abc} = T T T k^{abc}\)

2 dimensions type (2,0): \(\tilde{k}^{ab} = T T k^{ab}\)

1 dimensional vector type (1,0): \(\tilde{k}^{a} = T k^{a}\)

0 dimensional scalar type (0,0): \(\tilde{k} = k\)

A scalar is also a zero dimensional tensor and does not get transformed. The gratifyingly quick and easy result from above was just a coincidence of using a very simple function.

Because of how they are indexed, I think that \({T^a}_b\) and \({S^a}_b\) can also be considered as mixed tensors of rank (1,1).

Tensor Products And Contraction

Several basic operations are possible on tensors. Tensors of the same type can be added easily, by adding all their individual numbers. Tensors can be multiplied by a scalar by multiplying every number by the scalar.

Multiplying tensors by other tensors is much more complex. It involves two main operations: products and contraction. We've seen both of these before, but this section goes into more detail.

First of all, I'll introduce a new approach to the little letters on the top which come from Einstein's summation notation. For example let's say we have two one-dimensional vectors \(v\) and \(w\). They each have just one index (aka dimension). If we use a different letter for each index, it means that each number in vector \(v\) must be multiplied separately by each number in \(w\). This is the tensor product. For vectors, it is the same as the vector direct product. The result is a tensor/matrix with two dimensions:

\(v^a w^b = v \otimes w = \begin{bmatrix} v^1 w^1 & v^1 w^2 \\ v^2 w^1 & v^2 w^2 \end{bmatrix} \)

The product of two higher dimensional tensors is formed in the same way. It means that if a (2,1) type tensor is multiplied by a (1,3) type tensor, the result will be a (3,4) type tensor. All the numbers are multiplied separately, and the ranks are added together.
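As a rough illustration of the tensor product, here is a minimal Python sketch. Using the (1,2) and (2,3) vectors from the start of the article as \(v\) and \(w\) is my own choice for the example.

```python
# Minimal sketch of the tensor product v^a w^b for two vectors.
# Assumption: v and w are the (1,2) and (2,3) vectors from the start of the article.
v = [1, 2]
w = [2, 3]

# Different letters a and b: every entry of v meets every entry of w,
# so the result needs two indices, product[a][b].
product = [[v[a] * w[b] for b in range(len(w))] for a in range(len(v))]

print(product)  # [[2, 3], [4, 6]]
```

Two one-index objects went in and a two-index object came out, which is the "ranks are added together" rule in its smallest form.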

The opposite of this is contraction. This involves contracting two tensor dimensions into a single number. It took me a while to understand this, but it is pretty much what we've been doing all along. For vectors it is the same as the vector inner product. The components of each vector are multiplied and added and the result is a single number. So one dimension from each vector contracts together into a single number. Contraction is indicated when the little letters are repeated. For example this equation contains two a indices:

\(v^a w^a = v \cdot w = v^1 w^1 + v^2 w^2 \)

The confusing thing is that \(w\) is the same in both cases, whether we write \(w^a\) or \(w^b\). The only thing that changes is the operation.
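To make the difference concrete, here is a small Python sketch that feeds the same two vectors through both operations. The values (1,2) and (2,3) are assumed purely for illustration.

```python
# Minimal sketch: same vectors, two different operations.
# Assumption: v = (1,2) and w = (2,3) are example values only.
v = [1, 2]
w = [2, 3]

# Different letters (v^a w^b): keep every pairing, giving a 2x2 grid.
outer = [[v[a] * w[b] for b in range(2)] for a in range(2)]

# Same letter (v^a w^a): multiply matching entries and add them up.
inner = sum(v[a] * w[a] for a in range(2))

# Contracting the grid along its diagonal gives the same single number.
diagonal_sum = sum(outer[a][a] for a in range(2))

print(outer)                # [[2, 3], [4, 6]]
print(inner, diagonal_sum)  # 8 8
```

Nothing about `v` or `w` changed between the two lines. Only the index letters, and therefore the operation, did.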

We can use the same syntax for multiplying matrices by vectors. For example, the multiplication below would result in a 3 dimensional tensor with entries for every number in \({T^a}_b\) times every number in \(v^{c}\):

\({T^a}_bv^c = {R^{ac}}_b\)

But we can change the \(c\) to a \(b\) and it becomes a standard matrix multiplication:

\({T^a}_bv^b = R^{a}\)

The a index in \({T^a}_b\) represents the row number, and the b is the column number. To make this more explicit:

\(\begin{bmatrix}{T^{a=1}}_{b=1} & {T^{a=1}}_{b=2} \\ {T^{a=2}}_{b=1} & {T^{a=2}}_{b=2}\end{bmatrix} \begin{bmatrix}v^{b=1} \\ v^{b=2} \end{bmatrix} = \begin{bmatrix}R^{a=1} \\ R^{a=2} \end{bmatrix}\)

Every b in \({T^a}_b\) is multiplied and added to the corresponding b in \(v^b\). The rows of \({T^a}_b\) are multiplied and added with the single column of \(v^b\). Everything labelled b is swallowed up (aka contracted) out of existence. Only the row numbers labelled a of \({T^a}_b\) survive.
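The sketch below spells out both versions with explicit index loops in Python. It uses the \(T\) matrix from earlier and assumes \(v = (1,2)\) as an example vector.

```python
# Minimal sketch: T^a_b v^c (no contraction) versus T^a_b v^b (contraction).
# Assumption: v = (1,2) is just an example vector.
T = [[2, 2], [3, 4]]   # T[a][b]: a is the row, b is the column
v = [1, 2]

# Different letters: nothing is summed, so three indices survive.
# Stored here as R3[a][c][b].
R3 = [[[T[a][b] * v[c] for b in range(2)] for c in range(2)] for a in range(2)]

# Repeated letter b: b is summed away, leaving only the row index a.
R = [sum(T[a][b] * v[b] for b in range(2)) for a in range(2)]

# Same answer by contracting the three-index result where c equals b.
R_from_R3 = [sum(R3[a][b][b] for b in range(2)) for a in range(2)]

print(R, R_from_R3)  # [6, 11] [6, 11]
```

The three-index version holds 8 numbers, while the contracted version holds just 2, one per row of \(T\).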

The result is a one-dimensional vector \(R\) with rows going down labelled \(a\). Note that in this article the indices on the left and right side of an equation are treated as completely separate. They are only placeholders to count and indicate rows/columns. Only by convention do they start with \(a\) on both sides (stricter textbooks insist that the unrepeated letters match on both sides). We could just as validly have written:

\({T^a}_b v^b = R^{z}\)

Or even changed the order:

\(v^m {T^f}_m = R^{p}\)

They all mean the same thing. All that matters is that the \(b\) or \(m\) is repeated on the left, which means a contraction.

Full two dimensional matrix multiplication can also be written this way. In this case, two contravariant kayaking pairs are multiplied, so the indices aren't up and down, but the b is still repeated on the left which indicates a contraction:

\(k^{ab}m^{bc} = R^{ab}\)

This multiplies/adds the rows of \(k^{ab}\) with the columns of \(m^{bc}\). The result is a two dimensional matrix, with b contracted away.

This can be extended further to fully describe the way that the kayaking pairs matrix was doubly transformed. Long ago we had:

\(\tilde k^{ab} = T T k^{ab}\)

And after that I gave a long explanation about how to perform that multiplication, by multiplying and adding each row of each T with different elements of \(k^{ab}\). All the gory details of that tensor multiplication can be written succinctly as:

\(\tilde k^{ab} = {T^a}_b {T^c}_d k^{bd}\)

As above, when a letter index appears twice on the same side of an equation (the right side this time), it indicates a contraction. In this equation each row of \({T^a}_b\) is multiplied/added against the first index of \(k^{bd}\), and each row of \({T^c}_d\) against its second index. The repeated indices (the columns b and d of the two \(T\)s) are contracted to nothingness. In ordinary matrix terms this is \(T k T^{\top}\), with the second \(T\) transposed. Remember that \(T\) is the same each time, it's just the index placeholders that change. The result is a two dimensional tensor/matrix.
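Here is a minimal Python sketch of that double contraction, written as explicit sums over b and d. The matrix `k` is an assumption: I have built it as the product of the (1,2) and (2,3) vectors from the start of the article, which may not match the exact kayaking numbers used earlier.

```python
# Minimal sketch of k~ = T^a_b T^c_d k^{bd} using explicit index sums.
# Assumption: k is the tensor product of (1,2) and (2,3).
T = [[2, 2], [3, 4]]
alpha = [1, 2]
beta = [2, 3]
k = [[alpha[b] * beta[d] for d in range(2)] for b in range(2)]

n = 2
# b and d are repeated, so they are summed away; a and c survive.
k_tilde = [[sum(T[a][b] * T[c][d] * k[b][d]
                for b in range(n) for d in range(n))
            for c in range(n)]
           for a in range(n)]

# Cross-check: transform each vector with T first, then take the product.
alpha_t = [sum(T[a][b] * alpha[b] for b in range(n)) for a in range(n)]
beta_t = [sum(T[c][d] * beta[d] for d in range(n)) for c in range(n)]
check = [[alpha_t[a] * beta_t[c] for c in range(n)] for a in range(n)]

print(k_tilde)  # [[60, 108], [110, 198]]
print(check)    # [[60, 108], [110, 198]]
```

Transforming the finished matrix twice gives the same numbers as transforming the two vectors once each and multiplying afterwards, which is a handy sanity check on the index bookkeeping.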

This syntax can describe as many transformations as needed. For instance, for a type (2,1) tensor:

\({\tilde{k}^{ab}}_c = {T^a}_b {T^c}_d {k^{bd}}_e {S^e}_f \)

Remember the abc on the left are completely unrelated to the abc on the right. The repetition of the little letters on the right tells us all we need to know about how to compute this complex transformation.

A tensor can also contract with itself in a similar manner. One of its contravariant indices can be paired with one of its covariant indices, and the entries where the two match are added up. For square matrices this is called the trace. For example, take our type (1,1) functional tensor \(f\). We can give it indices:

\({f^a}_b = \begin{bmatrix}1 & 1 \\ 10 & 4 \end{bmatrix}\)

But if we make both indices the same, the matrix will instantly contract into a single number:

\({f^a}_a = 1 + 4 = 5\)

Which is analogous to going straight from \(v^a w^b\) to \(v^a w^a\). And in fact it contracts to the same number even after transformation:

\({{\tilde f}^a}_b = \begin{bmatrix}29 & -12 \\ 57.5 & -24 \end{bmatrix}\)

\({{\tilde f}^a}_a = 29 - 24 = 5\)
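A quick numerical check of this, as a minimal Python sketch. The helpers `matmul` and `trace` are hypothetical names written for this example.

```python
# Minimal sketch: the self-contraction f^a_a survives the transformation T f S.
# matmul and trace are hypothetical helpers written for this example.

def matmul(A, B):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))]
            for i in range(len(A))]

def trace(M):
    """Contract a square matrix with itself: add the entries where a == b."""
    return sum(M[a][a] for a in range(len(M)))

T = [[2, 2], [3, 4]]
S = [[2, -1], [-1.5, 1]]   # inverse of T
f = [[1, 1], [10, 4]]

f_tilde = matmul(matmul(T, f), S)

print(trace(f), trace(f_tilde))  # 5 5.0
```

The second value prints as 5.0 only because \(S\) contains a decimal. It is the same number.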

I hope that explains tensor multiplication clearly. I rewrote this section nearly two months after the rest of the article because I didn't understand contraction at first. But it is used all over the place with tensors, so it's very good to have a firm grasp of what it does.


There are a couple of other useful things to say about tensors. A tensor is symmetric when two of its indices can be swapped without changing any of its values. For a 2x2 matrix, that means the lower left and upper right corners are the same. In other words when:

\(T_{ab} = T_{ba} \)

Our circle from above is an example of a symmetric tensor because the lower left and upper right values are both zero:

\(C_{ab} = C_{ba} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} \)

General relativity involves a very important 4x4 symmetric tensor which describes the curvature of space-time. It appears on the left side of the equation for general relativity and is called the Einstein tensor G. It's 4x4 so it has 16 entries, but every value below the diagonal is a copy of the matching value above it, so it actually only has 10 independent values (4 on the diagonal plus 6 on one side). So:

\(G_{ab} = G_{ba} = \begin{bmatrix} G_{11} & G_{21} & G_{31} & G_{41} \\ G_{21} & G_{22} & G_{32} & G_{42} \\ G_{31} & G_{32} & G_{33} & G_{43} \\ G_{41} & G_{42} & G_{43} & G_{44} \end{bmatrix} \)

There is another even better type of symmetry called skew symmetry. This happens when:

\(T_{ab} = -T_{ba} \)

In skew symmetric tensors, the top right corner is the negative of the bottom left. But the definition also means that all the entries on the diagonal must be 0. This is because only 0 can be its own negative. For example in the skew-symmetric tensor below \(T_{11} = -T_{11} = 0\):

\(T_{ab} = -T_{ba} = \begin{bmatrix} 0 & 2 \\ -2 & 0 \end{bmatrix} \)

In the four space-time dimensions of general relativity (x, y, z and time) a 4x4 skew-symmetric tensor would therefore only have 6 independent values instead of 16:

\(T_{ab} = -T_{ba} = \begin{bmatrix} 0 & -T_{21} & -T_{31} & -T_{41} \\ T_{21} & 0 & -T_{32} & -T_{42} \\ T_{31} & T_{32} & 0 & -T_{43} \\ T_{41} & T_{42} & T_{43} & 0 \end{bmatrix} \)
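The counts of 10 and 6 can be checked by brute force. This is a minimal Python sketch with made-up helper names (`is_symmetric`, `is_skew`). It counts index pairs on or above the diagonal for the symmetric case, and strictly above it for the skew-symmetric case.

```python
# Minimal sketch: test the two symmetry definitions and count independent values.
# is_symmetric and is_skew are hypothetical helpers written for this example.

def is_symmetric(M):
    """True when M[a][b] == M[b][a] for every pair of indices."""
    n = len(M)
    return all(M[a][b] == M[b][a] for a in range(n) for b in range(n))

def is_skew(M):
    """True when M[a][b] == -M[b][a] for every pair of indices."""
    n = len(M)
    return all(M[a][b] == -M[b][a] for a in range(n) for b in range(n))

C = [[1, 0], [0, 1]]    # the circle tensor
K = [[0, 2], [-2, 0]]   # the 2x2 skew-symmetric example

n = 4  # space-time dimensions
# Symmetric: the diagonal and one side of it can be chosen freely.
symmetric_count = sum(1 for a in range(n) for b in range(n) if a <= b)
# Skew-symmetric: the diagonal is forced to 0, so only one side is free.
skew_count = sum(1 for a in range(n) for b in range(n) if a < b)

print(is_symmetric(C), is_skew(K))  # True True
print(symmetric_count, skew_count)  # 10 6
```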


Obviously, tensors are very useful if you are running a singles club. But they are essential for understanding the General Theory of Relativity. Hopefully this article has given a solid basic introduction.

I mostly used two sources to understand tensors myself: chapter 1 of an online tutorial A Gentle Introduction to Tensors by Boaz Porat and chapter 5 of a textbook Introducing Einstein's Relativity by Ray D'Inverno. Neither source was easy. I got completely lost 10 pages into the first one and 5 pages into the second. I had to do lots of browsing and oodles of scribbling on bits of paper to figure it out. Even then, in both cases, I only understood one chapter out of three. So what has been covered here is still just a small part of all that tensors can be.

I hope you have found this tutorial useful. If you have any comments or corrections please let me know. The next article is about ordinary tensor differentiation.