Chapter 8 Basis and Change of Basis
When we think of coordinate pairs, or coordinate triplets, we tend to think of them as points on a grid where each axis represents one of the coordinate directions:

Figure 8.1: The Coordinate Plane
We may not have previously formalized it, but even in this elementary setting we are considering these points (vectors) as linear combinations of the elementary basis vectors
$$\mathbf{e}_1 = \begin{pmatrix}1\\0\end{pmatrix} \quad\text{and}\quad \mathbf{e}_2 = \begin{pmatrix}0\\1\end{pmatrix}.$$
For the red point at $(2,3)$, for instance,
$$\begin{pmatrix}2\\3\end{pmatrix} = 2\begin{pmatrix}1\\0\end{pmatrix} + 3\begin{pmatrix}0\\1\end{pmatrix} = 2\mathbf{e}_1 + 3\mathbf{e}_2. \tag{8.1}$$
We consider the coefficients in this linear combination (the scalars 2 and 3) as the coordinates of the point in the basis $\{\mathbf{e}_1,\mathbf{e}_2\}$.
We can also view Equation (8.1) as a way to separate the vector $(2,3)$ into orthogonal components: its projections onto each of the basis vectors, as illustrated in Figure 8.2.

Figure 8.2: Orthogonal Projections onto basis vectors
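The chapter works these decompositions by hand, but they are easy to check numerically. Below is a minimal NumPy sketch (the language choice is ours, not the text's) that verifies Equation (8.1) and recovers the coordinates 2 and 3 as dot products with the unit-length basis vectors.

```python
import numpy as np

# The red point from Figure 8.1 and the elementary basis vectors of R^2.
x  = np.array([2, 3])
e1 = np.array([1, 0])
e2 = np.array([0, 1])

# Equation (8.1): x is the linear combination 2*e1 + 3*e2.
print(np.allclose(2 * e1 + 3 * e2, x))   # True

# Because e1 and e2 have unit length, the coordinates are also the (signed)
# lengths of the orthogonal projections of x onto each basis vector.
print(x @ e1, x @ e2)                    # 2 3
```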
Definition 8.1 (Elementary Basis) For any vector $\mathbf{a} = (a_1, a_2, \dots, a_n) \in \mathbb{R}^n$, we can write
$$\mathbf{a} = a_1\mathbf{e}_1 + a_2\mathbf{e}_2 + \dots + a_n\mathbf{e}_n,$$
where $\mathbf{e}_i$ is the vector containing a 1 in the $i^{th}$ entry and zeros elsewhere. The collection $\{\mathbf{e}_1, \mathbf{e}_2, \dots, \mathbf{e}_n\}$ is called the elementary basis for $\mathbb{R}^n$.
The elementary basis is the coordinate system we use by default: whenever we write down a vector's entries without specifying a basis, we are implicitly giving its coordinates in the elementary basis.
However, there are many (infinitely many, in fact) ways to represent the same data points on different axes. If we wanted to view this data in a different way, we could use a different basis. Let's consider, for example, a new orthonormal basis, whose vectors we will call $\mathbf{v}_1$ and $\mathbf{v}_2$, drawn in green over the original grid in Figure 8.3.

Figure 8.3: New basis vectors, $\mathbf{v}_1$ and $\mathbf{v}_2$, drawn in green over the original grid
The scalar multipliers needed to write a point as a linear combination of $\mathbf{v}_1$ and $\mathbf{v}_2$ are that point's coordinates in the new basis.
If we want to change the basis from the elementary basis $\{\mathbf{e}_1,\mathbf{e}_2\}$ to $\{\mathbf{v}_1,\mathbf{v}_2\}$, we need to find the scalars $a$ and $b$ for which
$$a\mathbf{v}_1 + b\mathbf{v}_2 = \begin{pmatrix}2\\3\end{pmatrix}.$$
This is merely a system of two equations in two unknowns, which we can write in matrix form as
$$\begin{pmatrix}\mathbf{v}_1 & \mathbf{v}_2\end{pmatrix}\begin{pmatrix}a\\b\end{pmatrix} = \begin{pmatrix}2\\3\end{pmatrix}.$$
The solution $(a,b)$ gives the coordinates of the point in the new basis.
This result tells us that in order to reach the red point (formerly known as $(2,3)$ in our previous basis), we should travel $a$ units in the direction of $\mathbf{v}_1$ and $b$ units in the direction of $\mathbf{v}_2$.
In the same fashion, we can rewrite all 3 of the red points on our graph in the new basis by solving the same system simultaneously for all the points. Let $\mathbf{A}$ be the matrix whose columns contain the original (elementary-basis) coordinates of the points, and let $\mathbf{V} = \begin{pmatrix}\mathbf{v}_1 & \mathbf{v}_2\end{pmatrix}$ be the matrix whose columns are the new basis vectors.
Then the new coordinates of the data on the rotated plane, collected in a matrix $\mathbf{B}$, can be found by solving
$$\mathbf{V}\mathbf{B} = \mathbf{A}.$$
Using our new basis vectors, our alternative view of the data is the one shown in Figure 8.4.

Figure 8.4: Points plotted in the new basis, $\mathbf{v}_1$ and $\mathbf{v}_2$
In the above example, we changed our basis from the original elementary basis to a new orthonormal basis, which provides a different view of the data. All of this amounts to a rotation of the data around the origin. No real information has been lost: the points maintain their Euclidean distances from one another, since a rotation changes neither lengths nor angles. Our new variables, the coordinates along $\mathbf{v}_1$ and $\mathbf{v}_2$, are simply linear combinations of the original variables.
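Because the green basis vectors appear only in the figure, the sketch below uses a hypothetical orthonormal basis (a 45-degree rotation of the axes) and three hypothetical points to illustrate the claim: solving $\mathbf{V}\mathbf{B} = \mathbf{A}$ rotates the coordinates without changing the distances between points.

```python
import numpy as np

# Hypothetical orthonormal basis: the columns of V are v1 and v2, a
# 45-degree rotation of the elementary axes (not the book's exact basis).
V = np.array([[1.0, -1.0],
              [1.0,  1.0]]) / np.sqrt(2)

# Three hypothetical points, one per column, in elementary coordinates.
A = np.array([[2.0, 4.0, -1.0],
              [3.0, 1.0,  2.0]])

# Coordinates of the same points in the new basis: solve V B = A.
B = np.linalg.solve(V, A)

# Distances between points are preserved by the change to an orthonormal basis.
d_old = np.linalg.norm(A[:, 0] - A[:, 1])
d_new = np.linalg.norm(B[:, 0] - B[:, 1])
print(np.isclose(d_old, d_new))   # True
```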
In general, we can change bases using the procedure outlined in Theorem 8.1.
Theorem 8.1 (Changing Bases) Given a matrix of coordinates (in columns), $\mathbf{A}$, expressed in a basis $\mathcal{B}_1 = \{\mathbf{x}_1, \mathbf{x}_2, \dots, \mathbf{x}_n\}$, the coordinates of the same points in a new basis $\mathcal{B}_2 = \{\mathbf{v}_1, \mathbf{v}_2, \dots, \mathbf{v}_n\}$ are the columns of the matrix $\mathbf{B}$ that solves
$$\mathbf{V}\mathbf{B} = \mathbf{X}\mathbf{A},$$
where $\mathbf{V}$ and $\mathbf{X}$ are the matrices whose columns contain the basis vectors of $\mathcal{B}_2$ and $\mathcal{B}_1$, respectively.
Note that when our original basis is the elementary basis, $\mathbf{X} = \mathbf{I}$, so we simply solve $\mathbf{V}\mathbf{B} = \mathbf{A}$.
When our new basis vectors are orthonormal, $\mathbf{V}^{-1} = \mathbf{V}^T$, so the solution to this system is simply $\mathbf{B} = \mathbf{V}^T\mathbf{A}$.
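As a sketch of the procedure in Theorem 8.1 (the function and variable names below are ours), the helper solves $\mathbf{V}\mathbf{B} = \mathbf{X}\mathbf{A}$ and confirms the orthonormal shortcut $\mathbf{B} = \mathbf{V}^T\mathbf{A}$:

```python
import numpy as np

def change_basis(A, V, X=None):
    """Coordinates of the points in A (one point per column) in the new basis
    whose vectors are the columns of V. X holds the old basis vectors as
    columns; omit it when the old basis is the elementary basis (X = I).
    Solves V B = X A for B, as in Theorem 8.1."""
    rhs = A if X is None else X @ A
    return np.linalg.solve(V, rhs)

# Hypothetical orthonormal basis and data, as in the earlier sketch.
V = np.array([[1.0, -1.0],
              [1.0,  1.0]]) / np.sqrt(2)
A = np.array([[2.0, 4.0, -1.0],
              [3.0, 1.0,  2.0]])

B = change_basis(A, V)
# Because V is orthonormal, V^{-1} = V^T and B = V^T A gives the same answer.
print(np.allclose(B, V.T @ A))   # True
```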
Definition 8.2 (Basis) A basis for an arbitrary vector space $\mathcal{V}$ is any set of vectors that (i) spans $\mathcal{V}$, meaning every vector in $\mathcal{V}$ can be written as a linear combination of vectors in the set, and (ii) is linearly independent.
A basis for the vector space $\mathbb{R}^n$ must therefore contain exactly $n$ linearly independent vectors.
The preceding discussion dealt entirely with bases for $\mathbb{R}^n$, which is the setting we will need for the data examples that follow.
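Definition 8.2 can also be checked numerically in $\mathbb{R}^n$: a set of $n$ vectors forms a basis exactly when the matrix holding them as columns has rank $n$. A short sketch (the helper name is ours):

```python
import numpy as np

def is_basis_of_Rn(vectors):
    """True when the given vectors form a basis for R^n: there are exactly n
    of them and they are linearly independent, i.e. the matrix with these
    vectors as columns has full rank n."""
    M = np.column_stack(vectors)
    n = M.shape[0]
    return len(vectors) == n and np.linalg.matrix_rank(M) == n

print(is_basis_of_Rn([np.array([1, 0]), np.array([0, 1])]))   # True: elementary basis
print(is_basis_of_Rn([np.array([1, 1]), np.array([2, 2])]))   # False: linearly dependent
```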
8.1 Vector Space Models
The knowledge of coordinates and bases allows us to explore data using a vector space model. In a vector space model, the objects or observations of interest are treated as vectors in a vector space, and the dimensionality of that space is then often reduced in some way, yielding a new, smaller set of basis vectors along with coordinates of the data along those new basis vectors. Let's take a look at such a model using some text data as an example. The finer details of the text-mining process can be found in Chapter 16. Figure 8.5 shows 4 short "documents." The words highlighted in red are the words that appear in at least two documents. We'll use those red-highlighted words as the variables in our analysis, and the documents as the observations.

Figure 8.5: Four Short Documents
We'll create a matrix, called a term-document matrix, whose rows represent the terms, whose columns represent the documents, and whose $(i,j)$ entry counts the number of times term $i$ appears in document $j$.
These documents, upon formulation of the term-document matrix above, live in the 6-dimensional term space. We could expand each document in a coordinate-basis representation where each of the elementary axes represents a single term; a document's column of term counts then gives its coordinates along those axes.
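The four documents in Figure 8.5 are not reproduced here, so the sketch below builds a term-document matrix for a hypothetical mini-corpus; the documents and vocabulary are illustrative only, but the construction mirrors the one described above.

```python
import numpy as np

# Hypothetical mini-corpus (not the four documents from Figure 8.5).
docs = ["the dog chased the cat",
        "the cat bit the dog",
        "the dog bit the boy",
        "the boy hurt his hand"]

# Keep only terms that appear in at least two documents ("the" is ignored).
vocab = ["dog", "cat", "bit", "boy"]

# Term-document matrix: entry (i, j) counts occurrences of term i in document j.
tdm = np.array([[doc.split().count(term) for doc in docs] for term in vocab])
print(tdm)
# Each column is one document's coordinate vector in the elementary "term" basis.
```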
Now, we will change the basis and create new basis vectors that attempt to approximate this data in only 2 dimensions using a matrix factorization. Matrix factorizations are extremely powerful for data analysis, but we are not presently concerned with how they are created or chosen to solve a problem. Our goal in this section is merely to interpret the output and catch a glimpse of their utility. Below we examine one such matrix factorization, which was created using a Nonnegative Matrix Factorization (NMF) algorithm [55].
The "factors" in the left matrix live in the term space. Each factor can be considered as a linear combination of our original term vectors, with the entries of its column (the loadings) serving as the coefficients.
Thus, these "factors" can be thought of almost like documents: each is merely a weighted collection of terms. In particular, it seems as though factor 1 collects terms related to pets, while factor 2 collects terms related to injuries.
Furthermore, the "scores" in the right matrix provide information about how each document relates to each of these topics: each document's column of scores gives its coordinates along the two factors.
The above representation allows us to conclude that document 2 is mostly about factor/topic 1 (“pets”) and document 4 is mostly about factor/topic 2 (“injuries”) just by identifying the largest coordinate for each document.
A matrix factorization gives us information about both the rows and columns of the matrix of interest. Here, we saw both the terms and documents collected into two topics according to the loadings (left matrix) and the scores (right matrix).
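The factorization shown in the text is not reproduced here, but the idea is easy to replicate. The sketch below applies scikit-learn's NMF (a different implementation than the one cited) to a hypothetical 6-term by 4-document matrix and reads off each document's dominant topic as its largest score, as described above.

```python
import numpy as np
from sklearn.decomposition import NMF

# Hypothetical 6-term x 4-document matrix (rows: cat, dog, bit, boy, hurt, hand).
tdm = np.array([[2, 1, 0, 0],
                [1, 2, 0, 0],
                [1, 1, 0, 0],
                [0, 0, 1, 2],
                [0, 0, 2, 1],
                [0, 0, 1, 1]])

# Rank-2 factorization tdm ~ W @ H: W (terms x factors) holds the loadings,
# H (factors x documents) holds the scores.
model = NMF(n_components=2, init="nndsvd")
W = model.fit_transform(tdm)
H = model.components_

# The dominant topic for each document is the factor with the largest score.
print(H.round(2))
print(H.argmax(axis=0))   # e.g. [0 0 1 1] -- documents 1-2 vs. documents 3-4
```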
We strongly encourage the reader to convince themselves of our conclusions and to replicate this analysis by completing the related question in the exercises.
8.2 Exercises
1. What are the coordinates of the vector in the basis? Draw a picture to make sure your answer lines up with intuition.
2. In the following picture, what would be the signs (+/-) of the coordinates of the green point in the basis? Pick another point at random and answer the same question for that point.
3. Write the orthonormal basis vectors from Exercise 1 as linear combinations of the original elementary basis vectors.
4. What is the length of the orthogonal projection of the given vector onto the given basis vector?
5. Interpret the following Nonnegative Matrix Factorization output for a term-document matrix of a small collection of text documents, answering the following questions:
   - What meaning (theme/topic) would you give to each of the three factors?
   - What is the dominant factor (theme/topic) for each document?
   - What is the loading of the word baseball on Factor 2?
   - What is the coordinate/score of document 5 along Factor 3?