About the octal numbers. Sure, they look weird, because a "modern" byte consists of 8 bits and general-purpose microprocessors have 16-, 32-, or 64-bit words.
But, for example, the PDP-8, a very notable predecessor of the PDP-11, has 12-bit words, which split evenly into four octal digits.
As for the PDP-11, octal numbers are pretty practical for a hacker.
I'll try to demonstrate how nicely machine codes can be split into octal digits, but first it is worth mentioning that the PDP-11 has an orthogonal instruction set.
The base instruction set falls nicely into two patterns. Here is a bitwise representation.
Instructions with one operand:
|15 |           |           |         6 | 5       3 | 2       0 |
|   |   |   |   |   |   |   |   |   |   |   |   |   |   |   |   |
|                                       |    MODE   |     Rn    |
|                                       |  **   | * |    ***    |
| OP CODE                               | DESTINATION FIELD     |
* Specifies direct or indirect address
** Specifies how register will be used
*** Specifies one of 8 general purpose registers
And instructions with two operands:
| 15| 14      12| 11      9 | 8       6 | 5       3 | 2       0 |
| B |   |   |   |   |   |   |   |   |   |   |   |   |   |   |   |
|               |    MODE   |     Rn    |    MODE   |     Rn    |
|               |  **   | * |    ***    |  **   | * |    ***    |
| OP CODE       | SOURCE FIELD          | DESTINATION FIELD     |
* Direct/deferred bit for source and destination address
** Specifies how selected registers are to be used
*** Specifies a general register
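The two-operand layout above can be sketched in a few lines of Python. This is only an illustration: the function name `split_double_operand` and the sample word are my own picks, not anything from DEC. The point is that after the lone top bit, every field is exactly three bits wide, so every field is one octal digit.

```python
def split_double_operand(word):
    """Split a 16-bit two-operand instruction word into its fields.

    Hypothetical helper for illustration; field names follow the diagram.
    """
    return {
        "byte": (word >> 15) & 0o1,     # bit 15: B (byte operation)
        "opcode": (word >> 12) & 0o7,   # bits 14-12
        "src_mode": (word >> 9) & 0o7,  # bits 11-9
        "src_reg": (word >> 6) & 0o7,   # bits 8-6
        "dst_mode": (word >> 3) & 0o7,  # bits 5-3
        "dst_reg": word & 0o7,          # bits 2-0
    }


sample = 0o010203  # arbitrary sample word, written as six octal digits
fields = split_double_operand(sample)

# The fields, read in order, are simply the octal digits of the word.
digits = [int(d) for d in f"{sample:06o}"]
print(digits == list(fields.values()))
print(fields)
```

In hex the same word would be 0x1083, and none of those digits lines up with a field boundary.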
And to complete the picture, there are 8 addressing modes:
| Code | Name                   | Example    | Description                                                    |
|------+------------------------+------------+----------------------------------------------------------------|
| 0n   | Register               | OPR Rn     | The operand is in Rn                                           |
| 1n   | Register deferred      | OPR (Rn)   | Rn contains the address of the operand                         |
| 2n   | Autoincrement          | OPR (Rn)+  | Rn contains the address of the operand, then increment Rn by 2 |
| 3n   | Autoincrement deferred | OPR @(Rn)+ | Rn contains the address of the address, then increment Rn by 2 |
| 4n   | Autodecrement          | OPR -(Rn)  | Decrement Rn by 2, then use it as the address                  |
| 5n   | Autodecrement deferred | OPR @-(Rn) | Decrement Rn by 2, then use it as the address of the address   |
| 6n   | Index                  | OPR X(Rn)  | Rn+X is the address of the operand                             |
| 7n   | Index deferred         | OPR @X(Rn) | Rn+X is the address of the address                             |
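Since an operand field is just two octal digits (mode, then register), turning it back into assembler syntax is a table lookup. Here is a rough Python sketch of that idea; `MODE_SYNTAX` and `operand_syntax` are names I made up for the example, and the `X` in the index modes is left as a placeholder because the real offset lives in the following word.

```python
# Operand syntax for each of the eight modes, indexed by mode number.
# Illustrative table only; {n} is replaced by the register number.
MODE_SYNTAX = [
    "R{n}",      # 0: register
    "(R{n})",    # 1: register deferred
    "(R{n})+",   # 2: autoincrement
    "@(R{n})+",  # 3: autoincrement deferred
    "-(R{n})",   # 4: autodecrement
    "@-(R{n})",  # 5: autodecrement deferred
    "X(R{n})",   # 6: index (X stands for the offset word)
    "@X(R{n})",  # 7: index deferred
]


def operand_syntax(field):
    """Render a 6-bit operand field (two octal digits) in assembler syntax."""
    mode = (field >> 3) & 0o7  # first octal digit
    reg = field & 0o7          # second octal digit
    return MODE_SYNTAX[mode].format(n=reg)


print(operand_syntax(0o25))  # (R5)+
print(operand_syntax(0o00))  # R0
print(operand_syntax(0o63))  # X(R3)
```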
And there are 8 registers: R0-R7.
Now, let's translate the instruction MOVB (R5)+,R0 into machine code:
1 - byte operation
1 - MOV op code
2 - autoincrement addressing mode
5 - R5
0 - register addressing mode
0 - R0
MOVB (R5)+,R0 translates to 112500
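The same digit-by-digit assembly can be checked mechanically. A minimal Python sketch, assuming the two-operand layout shown earlier; `encode_double_operand` is a hypothetical helper name, not part of any real assembler.

```python
def encode_double_operand(byte, opcode, src_mode, src_reg, dst_mode, dst_reg):
    """Pack the six fields of a two-operand instruction into a 16-bit word."""
    fields = (opcode, src_mode, src_reg, dst_mode, dst_reg)
    if byte not in (0, 1) or any(not 0 <= f <= 7 for f in fields):
        raise ValueError("byte must be 0 or 1, other fields must be 0-7")
    word = byte                # bit 15
    for f in fields:
        word = (word << 3) | f  # append one octal digit per field
    return word


# MOVB (R5)+,R0: byte=1, MOV=1, autoincrement=2, R5, register mode=0, R0
word = encode_double_operand(1, 1, 2, 5, 0, 0)
print(f"{word:06o}")  # 112500
```

Each loop step shifts by three bits, which is exactly "write down the next octal digit".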
It's not that hard to learn to read and write machine codes when you use octal numbers  

People who programmed in assembler back in the day can still write straight in machine code.