diff --git a/input/chapter01/chapter01.xml b/input/chapter01/chapter01.xml
index 252a2cdb..ba8f169a 100644
--- a/input/chapter01/chapter01.xml
+++ b/input/chapter01/chapter01.xml
@@ -411,7 +411,7 @@
           <emphasis>kilo binary</emphasis> byte (shortened to KiB).
           The other prefixes have a similar prefix (Mebibyte, MiB, for
           example).  Tradition largely prevents use of these terms,
-          but you may seem them in some literature.</para>
+          but you may see them in some literature.</para>
         </section>
         <section>
           <info>
@@ -1766,7 +1766,7 @@
 	  <computeroutput>2^<superscript>8</superscript> =
 	  256</computeroutput> possible values; with our sign bit
 	  representation we could represent -127 thru 127 but with
-	  two's complement we can represent -127 thru 128.  This is
+	  two's complement we can represent -128 thru 127.  This is
 	  because we have removed the problem of having two zeros;
 	  consider that "negative zero" is <computeroutput>(~00000000
 	 +1)=(11111111+1)=00000000</computeroutput> (note
diff --git a/input/chapter02/chapter02.xml b/input/chapter02/chapter02.xml
index ded47365..2c260986 100644
--- a/input/chapter02/chapter02.xml
+++ b/input/chapter02/chapter02.xml
@@ -519,7 +519,7 @@
 	  instructions also allows for alternate implementations which
 	  may take advantage of the nature of instruction streaming;
 	  they are read-only so do not need expensive on-chip features
-	  such as multi-porting, nor need to handle handle sub-block
+	  such as multi-porting, nor need to handle sub-block
 	  reads because the instruction stream generally uses more
 	  regular sized accesses.</para>
       <figure xml:id="cache_associativity">
@@ -584,7 +584,7 @@
 	      particular address could be located in any way. Thus an
 	      <emphasis>n</emphasis>-way set associative cache will
 	      allow a cache line to exist in any entry of a set sized
-	      total blocks mod n &#x2014; <xref linkend="cache_associativity"/> shows a sample
+	      total blocks mod n &#x2014; <xref linkend="cache_associativity"/> shows a sample
 	      8-element, 4-way set associative cache; in this case the
 	      two addresses have four possible locations, meaning only
 	      half the cache must be searched upon lookup.  The more
@@ -655,8 +655,8 @@
             </imageobject>
             <caption>
               <para>Tags need to be checked in parallel to keep
-latency times low; more tag bits (i.e. less set associativity)
-requires more complex hardware to achieve this. Alternatively more set
+latency times low; more tag bits (i.e. more set associativity)
+requires more complex hardware to achieve this. Alternatively less set
 associativity means less tags, but the processor now needs hardware to
 multiplex the output of the many sets, which can also add latency.
 </para>
@@ -735,7 +735,7 @@ multiplex the output of the many sets, which can also add latency.
 	a separate chip that is part of the motherboard which buffers
 	and communicates interrupt information to the main processor.
 	Each device has a physical <emphasis>interrupt line</emphasis>
-	between it an one of the PIC's provided by the system.  When
+	between it and one of the PIC's provided by the system.  When
 	the device wants to interrupt, it will modify the voltage on
 	this line.</para>
         <para>A very broad description of the PIC's role is that it
@@ -775,9 +775,9 @@ multiplex the output of the many sets, which can also add latency.
         </figure>
         <para>Most drivers will split up handling of interrupts into
 	<emphasis>bottom</emphasis> and <emphasis>top</emphasis>
-	halves.  The bottom half will acknowledge the interrupt, queue
+	halves.  The top half will acknowledge the interrupt, queue
 	actions for processing and return the processor to what it was
-	doing quickly.  The top half will then run later when the CPU
+	doing quickly.  The bottom half will then run later when the CPU
 	is free and do the more intensive processing.  This is to stop
 	an interrupt hogging the entire CPU.</para>
         <section>
@@ -836,7 +836,7 @@ multiplex the output of the many sets, which can also add latency.
 	  indicate to the PIC that an interrupt has occurred, which it
 	  will signal to the operating system for handling.  However,
 	  if further pulses come in on the already asserted line from
-	  another device.</para>
+	  another device.</para>
           <para>The issue with level-triggered interrupts is that it
 	  may require some considerable amount of time to handle an
 	  interrupt for a device.  During this time, the interrupt
@@ -1030,7 +1030,7 @@ multiplex the output of the many sets, which can also add latency.
       system.</para>
       <para>The symmetric term refers to the fact that all the CPUs in
       the system are the same (e.g. architecture, clock speed).  In a
-      SMP system there are multiple processors that share other all
+      SMP system there are multiple processors that share all
       other system resources (memory, disk, etc).</para>
       <section>
         <info>
@@ -1463,7 +1463,7 @@ multiplex the output of the many sets, which can also add latency.
       that is a fence that allows any load or stores to be done before
       it (move upwards), but nothing before it to move downwards past
       it.  Thus, when load or store with release semantics is
-      processed, you can be store that any earlier load or stores will
+      processed, you can be sure that any earlier load or stores will
       have been complete.</para>
       <figure>
         <info>
diff --git a/input/chapter03/chapter03.xml b/input/chapter03/chapter03.xml
index fe46fa4e..8899ed31 100644
--- a/input/chapter03/chapter03.xml
+++ b/input/chapter03/chapter03.xml
@@ -289,7 +289,7 @@
 	simplest case, a small <emphasis>virtual machine monitor
 	</emphasis> can run directly on the hardware and provide an
 	interface to the guest operating systems running on top.  This
-	VMM is often often called a hypervisor (from the word
+	VMM is often called a hypervisor (from the word
 	"supervisor")<footnote><para>In fact, the hypervisor shares
 	much in common with a micro-kernel; both strive to be small
 	layers to present the hardware in a safe fashion to layers
@@ -426,7 +426,7 @@
 	is <computeroutput>read()</computeroutput>, etc.</para>
         <para>The <emphasis>Application Binary Interface</emphasis>
 	(ABI) is very similar to an API but rather than being for
-	software is for hardware.  The API will define which register
+	software is for hardware.  The ABI will define which register
 	the system call number should be put in so the kernel can find
 	it when it is asked to do the system call.</para>
       </section>
@@ -815,7 +815,7 @@
           <para>The 386 protection model has four rings, though most
 	  operating systems (such as Linux and Windows) only use two
 	  of the rings to maintain compatibility with other
-	  architectures that do now allow as many discrete protection
+	  architectures that do not allow as many discrete protection
 	  levels.</para>
           <para>386 maintains privileges by making each piece of
 	  application code running in the system have a small
diff --git a/input/chapter04/chapter04.xml b/input/chapter04/chapter04.xml
index bdae8a83..f6cd689e 100644
--- a/input/chapter04/chapter04.xml
+++ b/input/chapter04/chapter04.xml
@@ -274,7 +274,7 @@
 	<emphasis>brk</emphasis>, so called for the system call which
 	modifies it.  By using the
 	<computeroutput>brk</computeroutput> call to grow the area
-	downwards the process can request the kernel allocate
+	upwards the process can request the kernel allocate
 	more memory for it to use.</para>
         <para>The heap is most commonly managed by the
 	<computeroutput>malloc</computeroutput> library call.  This
@@ -506,7 +506,7 @@
 	provides a level of abstraction in how the Linux kernel can
 	create processes.</para>
         <para><computeroutput>clone</computeroutput> allows you to
-	explicitly specify which parts of the new process are copied
+	explicitly specify which parts of the old process are copied
 	into the new process, and which parts are shared between the
 	two processes.  This may seem a bit strange at first, but
 	allows us to easily implement <emphasis>threads</emphasis>
@@ -778,7 +778,7 @@
       <para>UNIX systems assign each process a
       <emphasis>nice</emphasis> value.  The scheduler looks at the
       nice value and can give priority to those processes that have a
-      higher "niceness".</para>
+      lower "niceness".</para>
     </section>
     <section>
       <info>
@@ -796,7 +796,7 @@ given increasing inputs.  If the algorithm takes twice as long to run
 for twice as much input, this is increasing linearly.  If another
 algorithm takes four times as long to run given twice as much input,
 then it is increasing exponentially.  Finally if it takes the same
-amount of time now matter how much input, then the algorithm runs in
+amount of time no matter how much input, then the algorithm runs in
 constant time.  Intuitively you can see that the slower the algorithm
 grows for more input, the better it is.  Computer science text books
 deal with algorithm analysis in more detail.</para></footnote>.</para>
diff --git a/input/chapter05/chapter05.xml b/input/chapter05/chapter05.xml
index 2eb7672f..e9b2993e 100644
--- a/input/chapter05/chapter05.xml
+++ b/input/chapter05/chapter05.xml
@@ -34,7 +34,7 @@
     to load from stored in a register.  For example, registers that
     are 32 bits wide can hold addresses in a register range from
     <computeroutput>0x00000000</computeroutput> to
-    <computeroutput>0xFFFFFFF</computeroutput>.
+    <computeroutput>0xFFFFFFFF</computeroutput>.
     2^<superscript>32</superscript> is equal to 4GB, so a 32 bit
     processor can load or store to up to 4GB of memory.</para>
     <section>
@@ -320,11 +320,11 @@
       64-bit address space.  Consider a 64-bit address space divided
       into 64 KiB pages creates
       2<superscript>64</superscript>/2<superscript>16</superscript> =
-      2<superscript>52</superscript> pages to be managed; assuming
+      2<superscript>48</superscript> pages to be managed; assuming
       each page requires an 8-byte pointer to a physical location a
       total of
-      2<superscript>52</superscript>*2<superscript>3</superscript> =
-      2<superscript>55</superscript> or 32 PiB of contiguous memory
+      2<superscript>48</superscript>*2<superscript>3</superscript> =
+      2<superscript>51</superscript> or 2 PiB of contiguous memory
       would be required just for the page table!  There are ways to
       split addressing up that avoid this which we will discuss
       later.</para>
@@ -662,7 +662,7 @@
           <title>Other page related faults</title>
         </info>
         <para>There are two other important faults that the TLB can
-	generally generate which help to mange accessed and dirty
+	generally generate which help to manage accessed and dirty
 	pages.  Each page generally contains an attribute in the form
 	of a single bit which flags if the page has been accessed or
 	is dirty.</para>
@@ -695,7 +695,7 @@
 	to manage pages.  The general concept is that a page has two
 	extra bits; the dirty bit and the accessed bit.  When the page
 	is put into the TLB, these bits are set to indicate that the
-	CPU should raise a fault .</para>
+	CPU should raise a fault.</para>
         <para>When a process tries to reference memory, the hardware
 	does the usual translation process.  However, it also does an
 	extra check to see if the accessed flag is
@@ -1061,7 +1061,7 @@
         resolved by the OS and the processor becomes a software-loaded
         architecture.  However, the performance impact of disabling
         the HPW is so considerable it is very unlikely any benefit
-        could be gained from doing so</para>
+        could be gained from doing so.</para>
         <section xml:id="virtual_linear_pagetable">
           <info>
             <title>Virtual Linear Page-Table</title>
diff --git a/input/chapter06/chapter06.xml b/input/chapter06/chapter06.xml
index 46f8a316..b00c1101 100644
--- a/input/chapter06/chapter06.xml
+++ b/input/chapter06/chapter06.xml
@@ -103,7 +103,7 @@
       <para>The first step of compiling a source file to an executable
     file is converting the code from the high level, human
     understandable language to <emphasis>assembly code</emphasis>.  We
-    know from previous chapters than assembly code works directly with
+    know from previous chapters that assembly code works directly with
     the instructions and registers provided by the processor.</para>
       <para>The compiler is the most complex step of process for a
     number of reasons.  Firstly, humans are very unpredictable and
@@ -416,7 +416,7 @@
           <title>Inlining functions</title>
         </info>
         <para>Similar to unrolling loops, it is possible to put embed
-	called functions within the callee.  The programmer can
+	called functions within the caller.  The programmer can
 	specify that the compiler should try to do this by specifying
 	the function as <computeroutput>inline</computeroutput> in the
 	function definition.  Once again, you may trade code size for
@@ -541,7 +541,7 @@
       with the prefix <computeroutput>extern</computeroutput>.
       <computeroutput>extern</computeroutput> stands for
       <emphasis>external</emphasis> and to a human means that this
-      variable is declared somewhere else.</para>
+      variable is defined somewhere else.</para>
         <para>What <computeroutput>extern</computeroutput> says to a
       compiler is that it should not allocate any space in memory for
       this variable, and leave this symbol in the object code where it
diff --git a/input/chapter07/chapter07.xml b/input/chapter07/chapter07.xml
index a11f4410..b4ffec1d 100644
--- a/input/chapter07/chapter07.xml
+++ b/input/chapter07/chapter07.xml
@@ -12,7 +12,7 @@
     <emphasis>text</emphasis> for historical reasons) and
     <emphasis>data</emphasis>.  We also know, however, an executable
     does not live its life in memory, but spends most of its life as a
-    file on a disk waiting to be loaded an run.  Since a file is, in
+    file on a disk waiting to be loaded and run.  Since a file is, in
     essence, simply a contiguous array of bits, all systems come up
     with methods of organising code and data within files for
     on-demand execution.  This file-format is generally referred to as
@@ -605,13 +605,13 @@
 	(e.g. <computeroutput>program.c</computeroutput>, which
 	<computeroutput>#include</computeroutput>s the header
 	file).</para>
-        <para>We create the library <application>ar</application>
+        <para>We create the library using the <application>ar</application>
 	(short for "archive") command.  By convention static library
 	file names are prefixed with
 	<computeroutput>lib</computeroutput> and have the extension
 	<computeroutput>.a</computeroutput>.  The
 	<computeroutput>c</computeroutput> argument tells the program
-	to create the archive, and <computeroutput>a</computeroutput>
+	to create the archive, and <computeroutput>r</computeroutput>
 	tells archive to add the object files specified into the
 	library file.<footnote><para>Archives created with
 	<application>ar</application> pop up in a few different places
@@ -1034,7 +1034,7 @@
 	pointer</emphasis> (gp).  The ABI specifies that r1 should
 	always contain the gp value for a function.  This means that
 	when you call a function, it is the
-	<computeroutput>callees</computeroutput> job to save their gp
+	<computeroutput>caller's</computeroutput> job to save their gp
 	value, set r1 to be the new value (from the function
 	descriptor) and <computeroutput>then</computeroutput> call the
 	function.</para>
diff --git a/input/chapter08/chapter08.xml b/input/chapter08/chapter08.xml
index 6bc9dd47..a6362075 100644
--- a/input/chapter08/chapter08.xml
+++ b/input/chapter08/chapter08.xml
@@ -1040,7 +1040,7 @@
 	do out of the ordinary operations.x</para>
         <para>We inspect the symbols with two different tools; in both
 	cases the binding is shown in the second column; the codes
-	should be quite straight forward (are are documented in the
+	should be quite straight forward (all are documented in the
 	tools <application>man</application> page).</para>
         <section>
           <info>
diff --git a/input/chapter08/code/elfrelocs.txt b/input/chapter08/code/elfrelocs.txt
index 2f2c1a0b..b5e33a28 100644
--- a/input/chapter08/code/elfrelocs.txt
+++ b/input/chapter08/code/elfrelocs.txt
@@ -1,7 +1,7 @@
 typedef struct {
   Elf32_Addr    r_offset;  <--- address to fix
   Elf32_Word    r_info;    <--- symbol table pointer and relocation type
-}
+} Elf32_Rel;
 
 typedef struct {
   Elf32_Addr    r_offset;