As you speak, vibratory energy is produced by the vocal folds housed in your larynx.
Part of that energy is transmitted to the nearby tissues and bones, e.g. the laryngeal musculature, tongue, mandible, and bones of the skull (see figure 1 - Laryngeal area).
Figure 1 - Laryngeal Area
As indicated in figure 2, two of these bone structures - the mandible and the temporal bone - are in very close proximity to the ear canal (Note: You can feel this relationship by placing your finger in your ear and moving your jaw).
During speech, the temporomandibular joint, where the mandible meets the temporal bone, vibrates, and some of the energy produced is transferred to the cartilaginous skeleton surrounding the ear canal (see figure 3). The movement of this cartilage produces sound in the canal. Because this energy contains certain low frequencies that are not transmitted via air conduction, your voice sounds more resonant to you than it does to others. This explains why recordings of your voice sound different from your self-monitored speech.


Figure 2 (External acoustic meatus - ear canal)
Figure 3 (Temporomandibular joint)