“Organizing, Visualizing, and Describing Data” is a Level I reading in the CFA Program. It covers the following learning outcomes.
When data is visualized, or put into a pictorial or graphical format, it is often easier to understand and analyze. Visualizations can be used to present frequency distributions, to discover multi-dimensional relationships, or to help interpret unstructured data.
Just as numerical data can be presented in a histogram, categorical data can be presented in a bar chart. The length or height of each bar represents the frequency of observations in that category.
A word cloud is a useful way to present unstructured textual data. A word cloud consists of words extracted from a source text, with the size of each word corresponding to how frequently it appears in the source text. Common words such as “a” or “it” are generally excluded to focus on more meaningful information.
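The counting step behind a word cloud can be sketched in a few lines of Python. This is only an illustration: the stop-word list, the sample sentence, and the function name `word_weights` are assumptions made for the example, and a real word cloud would go on to map each count to a font size.

```python
import re
from collections import Counter

# Assumed, deliberately tiny stop-word list; real lists are much longer.
STOP_WORDS = {"a", "an", "and", "in", "is", "it", "of", "the", "to"}

def word_weights(text, top_n=5):
    """Return the top_n (word, count) pairs after dropping stop words."""
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(w for w in words if w not in STOP_WORDS)
    return counts.most_common(top_n)

# Hypothetical source text, invented for illustration.
sample = ("The market fell and the market rallied; "
          "it is a volatile market in a volatile year.")

weights = word_weights(sample, top_n=2)
for word, count in weights:
    print(word, count)
```

The words with the highest counts would be drawn largest; everything in `STOP_WORDS` never reaches the chart.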
A treemap is a graphical tool for displaying categorical data. It uses colored rectangles to represent distinct categories, with the area of each rectangle being proportional to the value of the corresponding category.
A histogram is a chart that presents numerical data. In a histogram, the height of each bar or column represents the absolute frequency of each interval (bin) in the distribution.
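As a rough sketch of what sits behind the bars, the following Python groups numerical observations into equal-width intervals and counts how many fall into each one. The return figures, the choice of four bins, and the function name `histogram_counts` are assumptions made for the example; equal-width bins are one common convention, not the only one.

```python
def histogram_counts(values, num_bins):
    """Count observations in num_bins equal-width intervals.

    Assumes the values are not all identical (so the bin width is non-zero).
    """
    lo, hi = min(values), max(values)
    width = (hi - lo) / num_bins
    counts = [0] * num_bins
    for v in values:
        # The maximum value would land one past the end, so clamp it
        # into the last bin.
        index = min(int((v - lo) / width), num_bins - 1)
        counts[index] += 1
    edges = [lo + i * width for i in range(num_bins + 1)]
    return edges, counts

# Hypothetical monthly returns in percent, invented for illustration.
returns = [-4.0, -2.5, -1.0, 0.5, 1.0, 1.5, 2.0, 3.5, 5.0, 8.0]

edges, counts = histogram_counts(returns, num_bins=4)
for left, right, n in zip(edges, edges[1:], counts):
    print(f"{left:>5.1f} to {right:>5.1f}: {'#' * n} ({n})")
```

Each printed row corresponds to one bar, and the number of `#` characters plays the role of the bar's height.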
Contingency tables display the frequency distributions of two or more variables at once. They are useful for finding patterns among the variables.
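A two-variable contingency table can be built by counting how often each combination of values occurs. The sketch below uses invented sector and size labels, and the helper name `contingency_table` is hypothetical; the row and column totals it prints are a convenience added for the example.

```python
from collections import Counter

# Hypothetical observations: (sector, market-cap size) for eight stocks.
observations = [
    ("Tech", "Large"), ("Tech", "Small"), ("Energy", "Large"),
    ("Tech", "Large"), ("Health", "Small"), ("Energy", "Small"),
    ("Health", "Large"), ("Tech", "Large"),
]

def contingency_table(pairs):
    """Return a nested dict table[row][col] of joint frequencies."""
    cell_counts = Counter(pairs)
    rows = sorted({r for r, _ in pairs})
    cols = sorted({c for _, c in pairs})
    # Counter returns 0 for combinations that never occur.
    table = {r: {c: cell_counts[(r, c)] for c in cols} for r in rows}
    return table, rows, cols

table, rows, cols = contingency_table(observations)

# Print the table with a row total in the last column.
print(f"{'':<8}" + "".join(f"{c:>7}" for c in cols) + f"{'Total':>7}")
for r in rows:
    cells = "".join(f"{table[r][c]:>7}" for c in cols)
    print(f"{r:<8}{cells}{sum(table[r].values()):>7}")
```

Reading across a row or down a column shows how one variable is distributed for a given value of the other, which is where patterns between the two become visible.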
A frequency distribution displays data in a table. It can be constructed by counting the observations of a variable belonging to a distinct group or having a given value.
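The counting described here can be sketched with Python's standard library. The sector labels and the function name `frequency_distribution` are invented for illustration, and the relative-frequency column (each count divided by the total) is an extra added for the example.

```python
from collections import Counter

# Hypothetical sample: sector labels for ten stocks.
sectors = ["Tech", "Energy", "Tech", "Health", "Energy",
           "Tech", "Health", "Tech", "Energy", "Tech"]

def frequency_distribution(observations):
    """Return {value: (absolute frequency, relative frequency)}."""
    counts = Counter(observations)
    total = len(observations)
    # most_common() orders the groups from most to least frequent.
    return {value: (n, n / total) for value, n in counts.most_common()}

freq = frequency_distribution(sectors)
print(f"{'Sector':<8}{'Count':>6}{'Share':>8}")
for value, (absolute, relative) in freq.items():
    print(f"{value:<8}{absolute:>6}{relative:>8.0%}")
```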
Data must typically be cleanly formatted to be useful for quantitative analysis. Depending on the number of variables, it can be arranged into one-dimensional arrays or two-dimensional rectangular arrays.
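To make the two layouts concrete, the sketch below stores a single variable as a one-dimensional list and two variables as a rectangular list of rows, one row per observation and one column per variable. All figures and the helper names `is_rectangular` and `get_column` are assumptions made for the example.

```python
# One variable: a one-dimensional array of observations.
closing_prices = [101.2, 102.5, 101.9, 103.4]

# Several variables: a two-dimensional rectangular array.
# Each row is one observation; each column is one variable.
column_names = ["price", "volume"]
daily_data = [
    [101.2, 1500],
    [102.5, 1720],
    [101.9, 1310],
    [103.4, 1985],
]

def is_rectangular(table, n_columns):
    """True if every row has exactly n_columns entries."""
    return all(len(row) == n_columns for row in table)

def get_column(table, names, name):
    """Extract one variable (column) as a one-dimensional list."""
    idx = names.index(name)
    return [row[idx] for row in table]

print(is_rectangular(daily_data, len(column_names)))
print(get_column(daily_data, column_names, "volume"))
```

Checking that the table is rectangular is a simple form of the clean formatting the text calls for: a missing or extra entry in any row would make column-wise analysis unreliable.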
Data can be any collection of information, whether in the form of numbers, text, images, audio, etc. It can be classified across three dimensions: numerical vs. categorical; cross-sectional vs. time-series vs. panel; and structured vs. unstructured.