|Welcome to the Dark Side!... I grew up with the Amstrad CPC, and I started learning assembly with the Z80. However, as my experience with Z80 assembly grew, I wanted to start learning about other architectures, and see how they compare.
The 6502 and its variants powered many of the biggest systems of the '80s and '90s... from the ubiquitous C64 to the Nintendo Entertainment System, as well as the BBC Micro, PC-Engine and Atari Lynx... even the Super Nintendo used a 16-bit variant of the 6502 known as the 65816
The 6502's origins are somewhat odd: it was designed as a cheaper alternative to the 8-bit '6800' (the predecessor to the venerable 16-bit 68000)... the 6502 sacrificed some functions for a cheaper unit price, which is what won it such wide support... the 6510 which powered the C64 had a few added features...
A later version, the 65C02, added more commands (it was used in systems like the Apple IIc and the Atari Lynx)... and Hudson Soft made a custom version of the 65C02 with even more features, called the HuC6280 and used exclusively in the PC Engine
Apart from the 16-bit 65816, all these CPU variants are 8 bit, and the basic 6502 command set works in the same way on all these systems. It's that instruction set we'll be learning in these tutorials...
These tutorials will be written from the perspective of a Z80 programmer learning 6502, but they will not assume any prior knowledge of Z80, so if you're starting out in assembly, these tutorials will also be fine for you!
In these tutorials we'll start from the absolute basics... and teach you to become a multiplatform 6502 monster!... Let's begin!
Want to learn 6502? Get the Cheatsheet!
It has all the 6502 commands, and it also covers the extra commands used by the 65c02 and the PC-Engine's HuC6280
|We'll be using
the excellent VASM for our assembly in these tutorials... VASM
is an assembler which supports Z80, 6502, 68000, ARM and many
more, and also supports multiple syntax schemes...
You can get the source and documentation for VASM from the official website.
|Lesson 1 - Getting started with 6502|
|Lesson 2 - Addressing modes on the 6502|
|Lesson 3 - Loops and Conditions|
|Lesson 4 - Stacks and Math|
|Lesson 5 - Bits and Shifts|
|Lesson 6 - Defined data, Aligned data... Lookup Tables, Vector Tables, and Self-modifying code!|
|Lesson H1 - Hello World on the BBC Micro!|
|Lesson H2 - Hello World on the C64|
|Lesson H3 - Hello World on the VIC-20|
|Lesson H4 - Hello World on the Atari 800 / 5200|
|Lesson H5 - Hello World on the Apple II|
|Lesson H6 - Hello World on the Atari Lynx|
|Lesson H7 - Hello World on the Nes / Famicom|
|Lesson H8 - Hello World on the SNES / Super Famicom|
|Lesson H9 - Hello World on the PC Engine/TurboGrafx-16 Card|
|Lesson P1 - Bitmap Functions on the BBC|
|Lesson P2 - Bitmap Functions on the Atari 800 / 5200|
|Lesson P3 - Bitmap Functions on the Apple II|
|Lesson P4 - Bitmap Functions on the Atari Lynx|
|Lesson P5 - Bitmap Functions on the PC Engine (TurboGrafx-16)|
|Lesson P6 - Bitmap Functions on the NES / Famicom|
|Lesson P7 - Bitmap Functions on the SNES / Super Famicom|
|Lesson P8 - Bitmap Functions on the VIC-20|
|Lesson P9 - Bitmap Functions on the C64|
|Lesson P10 - Joystick Reading on the BBC|
|Lesson P11 - Joystick Reading on the Atari 800 / 5200|
|Lesson P12 - Joystick Reading on the Apple II|
|Lesson P13 - Joystick Reading on the Atari Lynx|
|Lesson P14 - Joystick Reading on the PC Engine (TurboGrafx-16)|
|Lesson P15 - Joystick Reading on the NES / Famicom and SNES|
|Lesson P16 - Joystick Reading on the VIC-20|
|Lesson P17 - Palette definitions on the BBC|
|Lesson P18 - Palette definitions on the Atari 800 / 5200|
|Lesson P19 - Palette definitions on the Atari Lynx|
|Lesson P20 - Palette definitions on the PC Engine (TurboGrafx-16)|
|Lesson P21 - Palette Definitions on the NES|
|Lesson P22 - Palette Definitions on the SNES / Super Famicom|
|Lesson P22 (z80) - Sound with the SN76489 on the BBC Micro|
|Lesson P23 - Sound on the Atari 800 / 5200|
|Lesson P23 (Z80) - Sound with the 'Beeper' on the Apple II|
|Lesson P24 - Sound on the Atari Lynx|
|Lesson P25 - Sound on the PC Engine (TurboGrafx-16)|
|Lesson P26 - Sound on the NES / Famicom|
|Lesson P27 - Sound on the SNES / Super Famicom: the SPC700|
|Lesson P28 - Sound on the SNES / Super Famicom: Writing ChibiSound|
|Lesson P29 - Sound on the VIC-20|
|Lesson P30 - Sound on the C64|
|Lesson P31 - Hardware Sprites on the Atari 800 / 5200|
|Lesson P32 - Hardware sprites on the Atari Lynx|
|Lesson P33 - Hardware Sprites on the PC Engine (TurboGrafx-16)|
|Lesson P34 - Hardware Sprites on the NES / Famicom|
|Lesson P35 - Hardware Sprites on the SNES / Super Famicom|
|Lesson P36 - Hardware Sprites on the C64|
You may think 64 kilobytes doesn't sound like much when a small game now takes 8 gigabytes, but that's 'cos modern games are sloppy, inefficient, fat and lazy - like the basement dwelling losers who wrote them!!!
6502 code is small, fast, and super efficient - with ASM you can do things in 1k that will amaze you!
We use a symbol to denote a hexadecimal number. In 6502 programming $ is typically used to denote hex, and # is used to tell the assembler something is a number (rather than an address), so #$ is used to tell the assembler a value is a hex number
In this tutorial VASM will be used for all assembly, if you use something else, your syntax may be different!
|Digit Value (D)||128||64||32||16||8||4||2||1|
|Our number (N)||1||1||0||0||1||1||0||0|
|D x N||128||64||0||0||8||4||0||0|
|128+64+8+4= 204 So %11001100 = 204 !|
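The table above can be checked with a few lines of Python. This is only an illustrative sketch (the helper name `binary_to_decimal` is made up for this example): it multiplies each digit value by the matching bit and adds up the results, exactly as the table does.

```python
# Place values of an 8-bit byte, most significant bit first (the 'D' row).
DIGIT_VALUES = [128, 64, 32, 16, 8, 4, 2, 1]

def binary_to_decimal(bits: str) -> int:
    """Add up the digit value of every bit that is set to 1."""
    if len(bits) != 8 or set(bits) - {"0", "1"}:
        raise ValueError("expected exactly 8 binary digits")
    return sum(d for d, b in zip(DIGIT_VALUES, bits) if b == "1")

print(binary_to_decimal("11001100"))  # 128+64+8+4
print(int("11001100", 2))             # Python's built-in conversion for comparison
```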
|If you ever get confused, look at Windows Calculator, switch to 'Programmer Mode' and it has binary and hexadecimal views, so you can change numbers from one form to another
If you're an Excel fan, look up the functions DEC2BIN and DEC2HEX... Excel has all the commands you need to convert one thing to the other!
|Negative number||-1||-2||-3||-5||-10||-20||-50||-254||-255|
|Equivalent Byte value||255||254||253||251||246||236||206||2||1|
|Equivalent Hex Byte Value||FF||FE||FD||FB||F6||EC||CE||2||1|
|All these number types can be confusing, but don't worry! Your assembler will do the work for you!
You can type %11111111, $FF, 255 or -1... but the assembler knows these are all the same thing! Type whatever you prefer in your code and the assembler will work out what that means and put the right data in the compiled code!
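To see why 255 and -1 end up as the same byte, here is a small Python sketch. It is not part of any assembler; `to_byte` is a hypothetical helper that keeps only the lowest 8 bits of a number, which is all a single byte can hold.

```python
def to_byte(value: int) -> int:
    """Keep only the low 8 bits, giving the byte (0..255) that would be stored."""
    return value & 0xFF

# Negative numbers and the byte / hex byte that represents them.
for n in (-1, -2, -3, -5, -10, -20, -50):
    print(n, to_byte(n), format(to_byte(n), "02X"))

# Four ways of writing the same byte.
print(to_byte(0b11111111), to_byte(0xFF), to_byte(255), to_byte(-1))
```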
|Compared to the Z80, two things are apparent about the 6502... firstly the stack pointer is only 8 bit... and secondly we have very few registers!
The way the Stack pointer works is simple... the stack is always positioned between $0100 and $01FF... Where xx is the SP register, the stack pointer will point to $01xx
The 'solution' to the lack of registers is special addressing options... the first 256 bytes, between $0000 and $00FF, are called the 'Zero Page', and the 6502 has many special functions which allow data in this memory range to be quickly used with the accumulator and other functions as if they were 'registers'!
Note: the PC-Engine has different Zeropage and Stackpointer addresses... and the 65816 can relocate them!... in this case the Zeropage (ZP) is often referred to as the Direct page (DP)
|Mode||Description||Format||Sample Command||Z80 Equivalent||Effective result|
|Implied / Inherent||A command that needs no parameters||SEC||SEC (set carry)||SCF|| |
|Relative||A command which uses the program counter PC with an offset nn (-128 to +127)||BEQ #$nn||BEQ [label] (branch if equal)||JR Z,[label]|| |
|Accumulator||A command which uses the Accumulator as the parameter||ROL||ROL (ROtate bits Left)||RLA|| |
|Immediate||A command which takes a byte nn as a parameter||ADC #$nn||ADC #1||ADC 1||&nn|
|Absolute||Take a parameter from a two byte memory address $nnnn||LDA $nnnn||LDA $2000||LD a,(&2000)||(&nnnn)|
|Absolute Indexed||Take a parameter from a two byte memory address $nnnn+X (or Y)||LDA $nnnn,X||LDA $2000,X|| ||(&nnnn+X)|
|Zero Page||Take a parameter from the zero page address $00nn||ADC $nn||ADC $32|| ||(&00nn)|
|Zero Page Indexed||Take a parameter from memory address $00nn+X||ADC $nn,X||ADC $32,X|| ||(&00nn+X)|
|Indirect||Take a parameter from the pointer at address $nnnn... if $nnnn contains $1234 the parameter would come from the address at $1234|| ||JMP ($1000)||LD HL,(&1000)|| |
|Indirect ZP||The 65c02 has an extra feature, where it can read through a pointer in the Zero Page without an index|| ||LDA ($80)|| ||((&00nn))|
|Pre Indexed (Indirect,X)||Take a parameter from the pointer at zero page address $nn+X... if X contained 4, and the two bytes at $nn+4 contained $1234, the parameter would come from address $1234||ADC ($nn,X)||ADC ($32,X)|| ||((&00nn+X))|
|Postindexed (Indirect),Y||Take the pointer from zero page address $nn, add Y... and get the parameter from the resulting address... if $nn contains $1234, and Y contained 4, the pointer $1234 would be read from $nn... then 4 would be added... and the parameter would be read from the resulting address ($1238)||ADC ($nn),Y||ADC ($32),Y|| ||((&00nn)+Y)|
|Basic command||Comparison||6502 command||Z80 equivalent||68000 equivalent|
|if Val1>=Val2 then goto label||>=||BCS label||JP NC,label||BGE label|
|if Val1<Val2 then goto label||<||BCC label||JP C,label||BLT label|
|if Val1=Val2 then goto label||=||BEQ label||JP Z,label||BEQ label|
|if Val1<>Val2 then goto label||<>||BNE label||JP NZ,label||BNE label|
|12345||(16384)||decimal memory address|
|$||$4000||(&4000)||Hexadecimal memory address|
|After a BIT command||BPL/BMI Dest||BVS/BVC Dest|
|Missing command||Meaning||6502 alternative|
|ADD #5||ADD a number without carry||CLC (Clear carry for add)
ADC #5 (Add 5)
|SUB #5||Subtract a number without carry||SEC (Set carry for sub)
SBC #5 (Subtract 5)
|NEG||convert positive value in Accumulator to negative value in Accumulator||EOR #255 (XOR/Flip bits)
CLC (Clear carry)
ADC #1 (add 1)
|SWAP A||Swap two Nibbles in A||ASL (shift left - bottom bit zero)
ADC #$80 (pop top bit off)
ROL (shift carry in)
ASL (shift left - bottom bit zero)
ADC #$80 (pop top bit off)
ROL (shift carry in)
|RLCA||Rotate left with wrap||CLC (Clear the carry)
ADC #$80 (pop top bit off)
ROL (shift carry in)
|RRCA||Rotate right with wrap||PHA (Backup A)
ROR (Rotate Right - get bit)
PLA (Restore A)
ROR (Rotate Right - set bit)
|BRA r||Jump to PC relative location +r
(Use instead of JMP for relocatable code)
|CLV Clear Overflow
BVC n Branch if overflow clear
|CALL NZ,subroutine||Skip over subroutine command if Zero||BEQ 3 Skip the JSR command
JSR subroutine Subroutine to call if nonzero
|RET Z||Return if Zero (skip over the return command if not Zero)||BNE 1 Skip the RTS command
RTS Return if zero
|PHX / PHY||Push X (PHX does exist on 65c02)
(do opposite for PLX)
|HALT||infinite loop until next Interrupt||CLV
|LDA (zp)||Load a from the address in (zp)
(not needed on 65c02... use LDA (00zp)
(do same for STA etc)
|Using Zeropage||Using X (takes 7 more bytes)|
| JSR TestSub
db $11,$22,$33 ;Parameters
ADC #3+1 ;(parameter bytes+1... so 3+1)
| JSR TestSub
db $11,$22,$33 ;Parameters
ADC #3 ;(parameter bytes... so 3)
BCC 3 ;Skip over inc command (3 byte cmd)
|INC (inc de)||DEC (dec de)||ADD (add bc to hl)||SUB|
|Negative||Overflow||Unused||Break||Decimal mode||Interrupt state||Zero||Carry|
|lookup 16 bit value A in [table]|
| ASL A
16 bit value is now in destval
| ASL A
(because RET adds 1 to address - you must subtract 1 from pointers in table)
| Lesson 1 - Getting started with 6502
I learned assembly on Z80 systems, and the 6502 seemed strange and scary!... but there's really nothing to worry about: while you have to use it a little bit differently, programming the 6502 is no harder than the Z80!
Let's start from the basics and learn how to use 6502!
|In these tutorials, we'll be using
VASM for our assembly, VASM is free, open source and supports
6502,Z80 and 68000!
We will be testing on various 6502 systems, and you may need to do extra steps (such as adding a header or checksum)... if you download my DevTools, batch files are provided to create the resulting files tested on the emulators used in these tutorials.
|Platform||Symbol Definition Required||Emulator used|
|Apple IIe||BuildAP2 equ 1||AppleWin|
|Atari 5200||BuildA52 equ 1||Jum52|
|Atari 800||BuildA80 equ 1||Atari800win|
|BBC Micro B||BuildBBC equ 1||BeebEm|
|Atari Lynx||BuildLNX equ 1||Handy|
|Nintendo NES/Famicom||BuildNES equ 1||Nestopia|
|PC Engine||BuildPCE equ 1||Ootake|
|Super Nintendo (SNES)||BuildSNS equ 1||Snes9x|
|Vic 20||BuildVIC equ 1||Vice|
|For these tutorials, I have provided a basic set of include files that will allow us to overlook the technicalities of each platform and just worry about the workings of 6502 for now...
We will look at ALL of this code later, in the Platform specific series... but we can't do that until we understand 6502 itself!
The example shown to the right will load the A register with $69 (69 in hexadecimal)
We will then call the 'Monitor' function - which will show the state of the CPU registers to screen!
in this way, whatever the 6502 system you're learning and what emulator you're using, we'll be able to do things in a common way!
The example to the right is split into 3 parts:
The generic header - this will set up the system to a text screen
The program - this is where we do our work
The generic footer - The functions and resources needed for the example to work
It's important to notice all the commands are inset by one tab... otherwise the Assembler will interpret them as labels.
The scripts provided with these tutorials will allow us to just look at the commands for the time being... we'll look at the contents of the Header+Footer in another series...
Of course if you want to do everything yourself that's cool... We're learning the fundamentals of the 6502 - and they will work on any system with that processor... but you'll need to have some other kind of debugger/monitor or other way to view the results of the commands if you're going it alone!... Good luck!
|The 6502 has 3 main registers...
A is known as the Accumulator - we use it for all our maths
X and Y are our other 2 registers... we can use them as loop counters, temporary stores, and for special address modes... but we'll look at that later!
Let's learn our first commands... LDA stands for LoaD A... it sets A to a value... we can also use LDX or LDY to load the X or Y registers!
Take a look at the example to the right... we're going to load A, X and Y... but notice... we're going to load them in different ways... A will be loaded with #$69... X will be loaded with #69... and Y will be loaded with 69... what will the difference be??
|Well here's the result... the values are shown in Hex...
so A=69... because specifying #$69 tells the assembler to use a HEX VALUE
but X=45... this is because without the $ the assembler used a Decimal value (45 hex = 69 decimal)
Y=0... why? well when we don't use a # the assembler gets the memory address.... so we read from memory address decimal 00069!... of course we can do $69 or $0069 to read from address hex 0069 too!
So #$xx = hex value .... #xx = decimal value.... and xx means read from address!
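The three prefix rules can be modelled in a few lines of Python. This is only a sketch of the rules described above, not how VASM is actually implemented, and `parse_operand` is a made-up name for illustration.

```python
def parse_operand(text: str):
    """Return (kind, number) for a simple numeric operand.

    kind is 'value' when the operand starts with '#', otherwise 'address'.
    """
    kind = "address"
    if text.startswith("#"):      # '#' marks a fixed value rather than an address
        kind = "value"
        text = text[1:]
    if text.startswith("$"):      # '$' marks hexadecimal
        return kind, int(text[1:], 16)
    if text.startswith("%"):      # '%' marks binary
        return kind, int(text[1:], 2)
    return kind, int(text, 10)    # no prefix: decimal

for operand in ("#$69", "#69", "69", "$69"):
    kind, number = parse_operand(operand)
    print(operand, kind, format(number, "02X"))
```

Printing the number in hex mirrors what the Monitor shows: `#69` comes out as 45, because 69 decimal is $45.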
|If you forget the # your code is going to malfunction - as the assembler will use an address rather than a fixed value!
It's an easy mistake to make, and it'll mean your code won't work... so make sure you ALWAYS put a # at the start of fixed values!... or you WILL regret it!
|12345||(16384)||decimal memory address|
|$||$4000||(&4000)||Hexadecimal memory address|
|We've been using this JSR
command... but what does it do?
Well JSR jumps to a subroutine... in this case JSR monitor will run the 'monitor' debugging subroutine... when the subroutine is done, the processor runs the next command
In this case that command is 'JMP *' which tricks the 6502 into an infinite loop!
JSR in 6502 is the equivalent of GOSUB in basic or CALL in z80.... we'll look at how to make our own subroutine in a later lesson!
|JMP is a jump command... and * is a special symbol that means 'the current line' to the assembler... so 'JMP *' means jump to this line...
This causes the 6502 to jump back to the start of the line... so it ends up running the jump command forever!... it's an easy way to stop the program for testing!
|In this example, we're going to set A to Hex 15... then we'll
show it by calling the Monitor
then we'll add 1... and show it again with the monitor
then we'll subtract 1... and show it again with the monitor
We don't want the Carry affecting things so we have to CLear the Carry with CLC before the ADC command...
However strangely if we don't want the Carry to affect subtraction, we have to SEt the Carry with SEC... before the SBC command - this is the opposite of the z80 command, but it's just the way the 6502 does things!
|Here is the result... you can see we go from 15, to 16, then back to 15!|
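To make the carry rules concrete, here is a rough Python model of ADC and SBC in normal (binary) mode. The function names are illustrative only, and decimal mode is deliberately ignored.

```python
def adc(a: int, operand: int, carry: int):
    """ADC: A = A + operand + carry. Returns (new A, new carry)."""
    total = a + operand + carry
    return total & 0xFF, 1 if total > 0xFF else 0

def sbc(a: int, operand: int, carry: int):
    """SBC: A = A - operand - (1 - carry). Carry ends up 0 if a borrow was needed."""
    total = a - operand - (1 - carry)
    return total & 0xFF, 0 if total < 0 else 1

a, c = adc(0x15, 1, 0)          # CLC then ADC #1
print(format(a, "02X"))
a, c = sbc(a, 1, 1)             # SEC then SBC #1
print(format(a, "02X"))
print(format(sbc(0x16, 1, 0)[0], "02X"))  # forgetting SEC subtracts one extra
```

The last line shows why SEC matters: with the carry clear, SBC takes away one more than we asked for.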
|We know how to set all the registers, but what if we have a
value in one register, and we want to transfer it to another...
Well, we can use TAX and TAY to Transfer A to X...or Transfer A to Y!
We can also use TXA or TYA to Transfer X to A... or Transfer Y to A!
What if we want to transfer X to Y? (or Y to X) ... well we can't directly, so we'd have to do TXA... then TAY
|You can see the result here... First we set A to $25 and Y to
$34 - the result is shown on the first line
Then we transfer A to X... and Y to A... the result is shown on the second line.
|Remember we learned that using LDA
with a number without a # means it will load from that numbered
address? - so LDA $13 will LoaD A from hex address $0013?
Well we can also STore A with the STA command!... we can also STore X with STX, or Store Y with STY!
In this example we'll use STA to store some values to memory addresses $0011 and $0012
We'll then set the Accumulator to $33 and add the values at these two memory addresses to the accumulator.... finally we'll use STA again to store the result to memory address $0013
When it comes to showing the result, we'll use another debugging subroutine I wrote called MemDump... this will dump a few lines of data to the screen... in this case we'll show 3 lines (of 8 bytes) covering memory addresses $0000-$0017... In this example, we'll show the memory before, and after we do the writes.
* Warning * If you're not using my sample code, these commands may overwrite system variables - and cause something strange to happen!
|Here's the result of the program running... you can see the bytes $11, $22 and $66 were written... these are the two values stored at the start... and then the result of these two added to the $33 loaded into the accumulator
Want to try something else?? Why not change CLC to SEC and ADC to SBC... and see what happens!
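The store-and-add sequence can be imitated in Python with a small byte array standing in for memory. This is a sketch under assumptions: the original listing is not reproduced here, so the exact instruction order in the comments is a guess, and the values $11 and $22 are taken from the result described above.

```python
memory = bytearray(0x20)        # stand-in for the first few bytes of RAM

memory[0x11] = 0x11             # e.g. LDA #$11 / STA $11
memory[0x12] = 0x22             # e.g. LDA #$22 / STA $12

a = 0x33                        # LDA #$33
carry = 0                       # CLC
for address in (0x11, 0x12):    # ADC $11 then ADC $12
    total = a + memory[address] + carry
    a, carry = total & 0xFF, total >> 8
memory[0x13] = a                # STA $13

print(memory[0x10:0x18].hex(" "))   # one 8-byte line, MemDump style
```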
|The first 256 bytes of memory $0000-$00FF are special on the 6502... in fact there's a lot we're not mentioning about reading and writing memory... but it's coming
Also the memory from $0100-$01FF is special... it's used by the stack!... don't know what that is? don't worry... we'll come to that!
Careful writing to memory on different systems... This example may not work right on some systems...
The PC-Engine is weird... unlike every other 6502 system... the range $0000-$01FF is NOT memory... that area is at $2000-$21FF
Why? because it's not actually a 6502... it's a HuC6280... it's almost the same as a 6502... but it has some extras and weirdness!
| Lesson 2 - Addressing modes on the 6502
The 6502 has very few registers - but it makes up for this with a mind-boggling number of addressing modes!
You won't need them all at first, but you should at least understand what they all do.
Let's try them all out with some simple examples!
|In order to run these examples
we're going to need to set up some areas of memory, by filling
them with test values.
The code to the right will do the work (via a Function called LDIR - which copies memory areas)... don't worry how it works for now, it's too complex at this time!
|Here is the rest of the Chunk copying code, and the data copied... again, you don't need to worry about this for now.|
|Here is the important bit... THIS is the data as it appears in memory when the program runs... you may want to refer back to this if you wish!|
These tutorials will not work on all systems... for example most will not work on the PC Engine, because the zero page is not at $0000
They may also not work on the NES or SNES, because the $2000 area has a special purpose on those systems.
They have all been tested on the BBC.... but don't worry... the theory shown here is based on the principles of the 6502 - so will work on ANY 6502 based system!
|We're all set up now... lets try
out all the addressing options... we'll look at the theory, and an
example program... then we'll see the result in the registers in a
screenshot from the BBC version
We'll be reading in all these examples... but many of the addressing modes can be used with other commands... please see the Cheatsheet for more details.
|Relative Addressing is where execution (the program counter)
jumps to a position relative to the current address - it can be
127 bytes after the calling line, or 128 bytes before....
This means the code will be 'relocatable' - we can move it in memory and it will still work, but we can't jump more than 128 bytes!
There are all kinds of 'Branch' commands... here we've used 'Branch if Carry Clear'... we'll look at the others in a later lesson
BCC ALWAYS takes a fixed number (not an address), so we don't have to use # with BCC in VASM!... that said, we can just use labels (names that appear at the far left) and let the assembler work out the maths.
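The maths the assembler (and the CPU) does for a branch can be sketched in Python. One detail to be aware of: the offset byte is signed, and it is counted from the address of the command that follows the 2-byte branch. The function name below is made up for illustration.

```python
def branch_target(address_after_branch: int, offset_byte: int) -> int:
    """Treat the offset byte as signed (-128..+127) and add it to the PC."""
    offset = offset_byte - 256 if offset_byte >= 128 else offset_byte
    return (address_after_branch + offset) & 0xFFFF

# A branch stored at $0200-$0201, so the next command is at $0202.
print(format(branch_target(0x0202, 3), "04X"))     # skip a 3-byte command
print(format(branch_target(0x0202, 0xFE), "04X"))  # offset -2: back to the branch itself
```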
|Take a look at the example to the
right... there are 3 Monitor commands... but only 2 show on the
screen... this is because the BCC skips over one
The "Program Counter" (shown as P) stores the byte of the end of the last command.... A "JSR Monitor" takes 3 bytes, "BCC 3" takes 2... hopefully the numbers the program counter shows will now make sense if you add up the commands!
|Accumulator addressing sounds more complex than it is!
Effectively it's a command with no parameters - it just changes the accumulator in some way....
|For example LSR shifts the bits to the right... don't worry if you don't understand it, we'll look at it later
|Again, Immediate sounds scary... but it's really easy... it's just a simple number in the code, specified with a #
As we've already learned... we can use # followed by $ to specify a hexadecimal number.
|In this example we will add Hex 10 and Hex 20... the result is obviously 30!
Why not try using different numbers, remove the $ to stop using hexadecimal... or try SBC... don't forget to change CLC to SEC if you do!
|The Zero Page is the 6502's special trick... addresses between $0000 and $00FF
are called the 'Zero Page'... these can be stored as a single
byte... so $FF would refer to address $00FF
Because the address is stored as a single byte - it's fast, and the Zero page can do things that other addresses cannot!
The 6502 uses this 'zero page' like a bank of 256 registers - allowing the 6502, with just its 3 registers, to do the things the Z80 did with over a dozen!
|In this example we'll load from
zero page address $80.... note that if we did LDA #$80 then we
would load the Value $80 not from the address...
This is important - you don't want to make that mistake (too often!)
|The Zero Page (sometimes called the Direct Page - usually when it's not at $0000) is effectively the 'temporary store' for all the data we can't get into the A, X and Y registers...
We can use different numbered addresses for different purposes, but many may be used by the machine's firmware!
|When we specify ,X or ,Y after an address it becomes an
offset... the register is added to the address in the zero page...
and the value is retrieved from the resulting address...
Note - you typically have to use X for this addressing mode... however LDX and STX are a special case, and we can use Y, because we can't use X if it's the source or destination of the command
Note... LDA $20,Y is not a valid command... however the assembler will convert it to LDA $0020,Y which IS... but it takes an extra byte, so is not as efficient!
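As a rough model, the address calculation for the two indexed forms looks like this in Python. The function names are hypothetical. The wrap-around inside the Zero Page is not covered in the text above, but it is how the standard 6502 behaves: the result of a zero page indexed address never leaves $0000-$00FF.

```python
def zero_page_indexed(base: int, index: int) -> int:
    """Effective address for 'LDA $nn,X': the sum stays inside the Zero Page."""
    return (base + index) & 0xFF

def absolute_indexed(base: int, index: int) -> int:
    """Effective address for 'LDA $nnnn,X' or 'LDA $nnnn,Y'."""
    return (base + index) & 0xFFFF

print(format(zero_page_indexed(0x80, 2), "04X"))  # $80 plus an index of 2
print(format(zero_page_indexed(0xFF, 2), "04X"))  # wraps round inside the Zero Page
print(format(absolute_indexed(0x0020, 2), "04X")) # the 3-byte form of $20,Y
```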
|As you can see here we're using the
Zero Page, and X and Y register....
take a look at the values we wrote to the Zero Page at the start, and try changing X,Y and the source location ($80) to other values.
|Of course we can't always read and write in the zero page... we'll want to specify the whole address... this takes an extra byte - so the command will be 3 bytes total and is slower, but we can get data from the whole 64k range ($0000-$FFFF)|
|Absolute addressing is good for variables we're not storing in the zero page (often most of the Zero page is used by the firmware!)... but isn't very good for reading in lots of data (like sprite images)... for that we want indirect addressing - which we'll look at soon!|
|When we want to read from multiple addresses, we can use Indexed addressing... this adds X or Y to an address - so we can change X/Y to read in from a range using a Loop!... we'll learn how to do a loop very soon!
$xxxx,Y can be used with many commands, but $xxxx,X has more options... check out the cheatsheet for more info!
|Changing X and Y allow you to change the source address without changing the LDA line.... we'll learn how to do this in loops and functions later.|
|We can directly read a 16-bit value from another 16-bit address ($0000-$FFFF) in one special case... the JuMP command (for all other cases we need to use the zero page).
This can be used to reprogram parts of your program - allowing alternate routines to be 'switched' in.
|In this example we use ($2000)... this loads in two bytes $1B1A
and then jumps to that address (sets the PC to 1B1A)...
Our setup put a "JSR MONITOR" at this address... so we see the contents of the registers... notice P (the program counter) is $1B1C... the last byte of the 3 byte "JSR MONITOR" command
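Here is a minimal Python sketch of what JMP ($2000) does with the two bytes it finds. The memory contents are assumed from the description above ($1A at $2000, $1B at $2001), and `read_word` is just an illustrative helper.

```python
def read_word(memory, address: int) -> int:
    """Read a 16-bit little-endian value: low byte first, then high byte."""
    return memory[address] | (memory[address + 1] << 8)

memory = bytearray(0x10000)     # 64k of empty memory
memory[0x2000] = 0x1A           # low byte of the destination
memory[0x2001] = 0x1B           # high byte of the destination

pc = read_word(memory, 0x2000)  # JMP ($2000)
print(format(pc, "04X"))
```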
|Pre-indexed Indirect with the X register uses the ZeroPage... X is added to the ZeroPage address.... two consecutive bytes are read in from the zero page, and these are used as an address... a byte is read from that address... Note... the data is stored in 'Little Endian' format... meaning the lower value byte comes first
This is all very confusing!... but think of it like this... two bytes of the zeropage are a 'temporary address' pointing to the actual data we will read
We can use these to simulate 'Z80 registers'... by setting one as an L register for the low byte, and the next as the H register for the high byte....
This is how we get around the 6502's lack of registers!... don't worry about it if you don't understand yet... we'll see this a lot later!
|In this example we've got X set to 1... so we end up loading a
byte from the address made up of bytes at $0081 and $0082 -
remember they are in reverse order because it's little endian!
we then show the result to screen.... of course setting X to 0... and changing $80 to $81 would have the same effect.
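The steps of a pre-indexed read can be written out in Python. This is a sketch only: the pointer ($2000) and the data byte ($42) are made-up test values, not the ones used in the tutorial's setup code.

```python
def load_pre_indexed(memory, zp_base: int, x: int) -> int:
    """Model of 'LDA ($nn,X)': add X first, then follow the zero page pointer."""
    pointer = (zp_base + x) & 0xFF            # stays inside the Zero Page
    low = memory[pointer]                     # little endian: low byte first
    high = memory[(pointer + 1) & 0xFF]
    return memory[(high << 8) | low]

memory = bytearray(0x10000)
memory[0x81], memory[0x82] = 0x00, 0x20       # hypothetical pointer to $2000
memory[0x2000] = 0x42                         # hypothetical data byte

print(format(load_pre_indexed(memory, 0x80, 1), "02X"))  # like LDA ($80,X) with X=1
print(format(load_pre_indexed(memory, 0x81, 0), "02X"))  # like LDA ($81,X) with X=0
```

Both calls read through the same pointer, which is why the two forms give the same result.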
|Post-Indexed with the Y register also uses the Zero Page... two consecutive bytes are read in from the Zero page to make an address... but the Y register is then added to THAT address... and the final value is read from the resulting address.
With this option, if we store an address in the Zero page... we can use Y as a counter and read from consecutive addresses... we can use this in a loop - we'll learn how to do that later
|Y is 2 in this example, so 2 is added to the address in ZeroPage ($0080-$0081)... if we change Y then the final address will change by the same amount|
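The same idea in Python for the post-indexed mode. Again this is a hedged sketch: the pointer value and the data bytes are invented for the example.

```python
def load_post_indexed(memory, zp_address: int, y: int) -> int:
    """Model of 'LDA ($nn),Y': follow the zero page pointer, then add Y."""
    low = memory[zp_address]                  # little endian: low byte first
    high = memory[(zp_address + 1) & 0xFF]
    return memory[(((high << 8) | low) + y) & 0xFFFF]

memory = bytearray(0x10000)
memory[0x80], memory[0x81] = 0x00, 0x20       # hypothetical pointer to $2000
memory[0x2000:0x2004] = bytes([0xA0, 0xA1, 0xA2, 0xA3])  # hypothetical data

# Changing Y walks through consecutive bytes without touching the pointer.
for y in range(4):
    print(y, format(load_post_indexed(memory, 0x80, y), "02X"))
```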
|This is a special mode only available on 65c02 used by the Lynx,
Snes, PcEngine and Apple II....
Effectively it's the same as Preindexed when X=0... or PostIndexed when Y=0... this is how we can simulate this addressing mode if we need to do this on the other machines!
It uses a pair of bytes in the Zero page as an address, and uses that address for the result
|It would be nice to have this mode on the other CPU's, but we
don't... however we can simulate it!
to fake it on other machines we set X=0 then use LDA ($81,X)
or we set Y=0 and then use LDA ($81),Y
You won't see much '65c02 only' code in these tutorials - so all the code will work on all systems, as we only use the basic 6502 commands
Of course you're free to use them if you wish, just remember - it will mean you can't port your code to another system as easily!
| Lesson 3 - Loops and Conditions
We've had a brief introduction to 6502, and now that we understand the addressing modes we can look properly at 6502. Let's take a look at some more commands, and how to do 'IF Then' type conditions and Loops!
|We've been cheating a little, we've overlooked a few important commands - they're hidden in the header, but we really need to know them!... before we start the proper lesson, let's look at them
We're going to need to know ALL the details of assembly to create a working program, and some things have been hidden until now! but we need to ensure we know everything.
|Because we're compiling to an 8-bit CPU with a 16-bit address bus, our compiled code maps to a fixed address within the memory space... this is important, because while branch commands like BCC use an 'offset'... JMP commands will 'Jump' to a specific address
To the right, you can see how the code will compile - this is the 'Listing.txt' file, showing the source code and the resulting binary output.
The SEI command is compiled to the byte $78 - this is the command as the CPU sees it... because of the ORG command, the code is compiled to the address $0200...
|We also have a Label... labels must be at the far left of the screen... all other commands must be inset
In this example, the label will be defined as address $0200 - so if we use it in a Jump command (hex $4C), it will be compiled to that address (in little endian byte order - so $0200 becomes $00 $02)
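The byte order can be seen by 'assembling' a JMP by hand in Python. This only models the three bytes described above ($4C followed by the address, low byte first); `assemble_jmp` is an illustrative name, not a VASM function.

```python
def assemble_jmp(target: int) -> bytes:
    """Encode an absolute JMP: opcode $4C, then the address low byte, then high byte."""
    return bytes([0x4C, target & 0xFF, (target >> 8) & 0xFF])

print(assemble_jmp(0x0200).hex(" "))   # a label defined at $0200
```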
|Interrupts are where the CPU does other tasks whenever it wants!
For simplicity at this stage, we want to stop that, so we use SEI to "Set the Interrupt Mask"
Don't worry about interrupts yet, we'll look at them later... so for now we just need to know how to turn them off
|Symbols are similar to labels... they allow us to give a 'name' (like TestSym) a 'value'... rather than using the value later, we can just use the symbol... Using symbols makes it easy for us to program, as we can use explanatory text rather than meaningless numbers
The assembler will convert the symbol name to its original value... we just use EQU to define the symbol... in the example, once assembled, LDA converts to byte $A5... and TestSym has a value of $69
In VASM, like labels, symbol definitions must be at the far left of the screen
|There will be frequent times when we need to increase and decrease values by just 1
For the X or Y registers we can do this with INX and DEX (or INY and DEY)
We can increase or decrease values in the ZeroPage by using INC $01 or DEC $01
Rather annoyingly there is no INC or DEC command for the Accumulator on the 6502... so we have to simulate it, by clearing the carry and adding one (CLC, ADC #1), or by setting the carry and subtracting one (SEC, SBC #1)
|Here you can see the results of the program...
The first three lines show the status of the registers at each stage.... and we can see how A, X and Y are affected by each stage of the program
The lower half shows the zero page - and we can see how $01 goes up and down as we do INC and DEC commands
|Branches allow us to do things depending on a condition... we
can use this to create a loop!
Because we don't have a DEC command for the accumulator, it's often easier to use X or Y as a loop counter.
If we use DEX to decrement the counter, BNE will jump back until the counter reaches zero... note that BNE needs to be immediately after the decrement command, as other commands may alter the Z flag
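The shape of a DEX / BNE loop can be imitated in Python. This is a sketch of the control flow only (the function name is made up); it counts how many times the loop body would run for a given starting value of X.

```python
def count_down(start: int) -> int:
    """Model 'LDX #start ... DEX / BNE loop' and count the passes through the body."""
    x = start & 0xFF              # LDX #start
    passes = 0
    while True:
        passes += 1               # (the loop body would go here)
        x = (x - 1) & 0xFF        # DEX wraps from 0 to 255, like any byte
        zero_flag = (x == 0)      # DEX sets Z when the result is zero
        if zero_flag:             # BNE only jumps back while Z is clear
            break
    return passes

print(count_down(5))
print(count_down(0))   # starting at 0 wraps round, so the body runs 256 times
```

The second call shows a common surprise: because the test happens after the decrement, starting with X=0 gives the longest loop, not the shortest.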
|BCC||Branch if Carry Clear||flag C=0||Was there no carry caused by the last command?*|
|BCS||Branch if Carry Set||flag C=1||Was there a carry caused by the last command?*|
|BEQ||Branch if Equal||flag Z=1||Is the result of the last command zero?|
|BMI||Branch if Minus||flag N=1||Is the result of the last command >=128 (negative)?|
|BNE||Branch if Not Equal||flag Z=0||Is the result of the last command non-zero?|
|BPL||Branch if Plus||flag N=0||Is the result of the last command <128 (positive)?|
|BVC||Branch if Overflow Clear||flag V=0||Was there no overflow caused by the last command?*|
|BVS||Branch if Overflow Set||flag V=1||Was there an overflow caused by the last command?*|
|Basic command||Comparison||6502 command||Z80 equivalent||68000 equivalent|
|if Val2>=Val1 then goto label||>=||BCS label||JP NC,label||BGE label|
|if Val2<Val1 then goto label||<||BCC label||JP C,label||BLT label|
|if Val2=Val1 then goto label||=||BEQ label||JP Z,label||BEQ label|
|if Val2<>Val1 then goto label||<>||BNE label||JP NZ,label||BNE label|
|Branch commands are pretty limited, they can only jump about 128 bytes away (from -128 to +127 bytes, counted from the command after the branch), if you try to jump further you will get an error|
|If you need to jump further, or you
want to use JSR with a condition you have to do things
backwards!.... jump OVER the JSR or JMP command if the condition
is NOT met
For example... if you want to call the Monitor if X=2... then you have to use a branch command to jump OVER the call if X is not 2...
|The result is that the monitor is called only when X=2... we've faked a 'Jump to SubRoutine on Equal' command... we can also do the same with a JMP to get further than 128 bytes away!|
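The 'backwards' trick described above might look like this. Monitor is assumed to be the label of the monitor routine, and SkipCall is a made-up label:

```asm
        CPX #2          ; compare X with 2 (Z is set if they are equal)
        BNE SkipCall    ; X is NOT 2, so jump over the call
        JSR Monitor     ; only reached when X = 2
SkipCall:
                        ; the program carries on here either way
```

Swapping the JSR for a JMP gives a conditional jump that can reach any address, not just one within branch range.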
|JMP jumps to a specific memory address, whereas BEQ and other
branch commands jump to a relative position...
There may be cases where you want to write code that can be relocated... copied to a new memory address and still executable... JMP will not work in this case, but branch will...
The 65C02 has a BRA command for this purpose (branch always)... but the 6502 does not... we can however simulate it by clearing the rarely used overflow flag with CLV, then using BVC
Don't worry if you don't see any reason to do this - you may never need to! if you don't know why you'd need relocatable code - then you don't need it!
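A sketch of the fake 'branch always' mentioned above (Target is a hypothetical label, which must be within branch range):

```asm
        CLV             ; clear the overflow flag
        BVC Target      ; V is now certainly clear, so this always branches
        LDA #$FF        ; never executed
Target:
        LDA #$01        ; execution continues here
```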
|It's important to understand that
ALL other languages convert to assembly... so anything Basic or
C++ can do can be done in ASM!
We can create 'If Then ElseIf' commands or even 'Case' Statements in assembly, just by chaining multiple compare and branch commands together.
|The result will be the program will branch out to each of the subsections depending on X|
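One way such a 'Case' statement could be laid out is sketched below. All the labels and the values loaded into A are invented for the example:

```asm
        CPX #1
        BEQ CaseOne     ; X = 1
        CPX #2
        BEQ CaseTwo     ; X = 2
        LDA #$00        ; 'else' case: X was anything other than 1 or 2
        JMP CaseDone
CaseOne:
        LDA #$11
        JMP CaseDone
CaseTwo:
        LDA #$22
CaseDone:
                        ; all three paths meet again here
```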
|Through a combination of
conditions we can do any condition in assembly that C++ or Basic
can do... that's because those languages compile DOWN to
assembly!
That said, it will take a lot more work in assembly!
| Lesson 4 - Stacks and Math
Now we know how to do conditions, jumping and the other basics, it's time to look at some more advanced commands and principles of Assembly..
Let's take a look!
|'Stacks' in assembly are like an
'In tray' for temporary storage...
Imagine we have an In-Tray... we can put items in it, but only ever take the top item off... we can store lots of paper - but have to take it off in the reverse of the order we put it on!... this is what a stack does!
If we want to temporarily store a register - we can put its value on the top of the stack... but the last value we put on will be the first one we get back...
The stack will appear in memory, and the stack pointer goes DOWN with each push on the stack... so if it starts at $01FF and we push 1 byte, it will point to $01FE
|on the Z80 we have Push and Pop,
but on the 6502 it's Push and Pull!
We PUSH values onto the top of the stack to back them up, and PULL them off!
Our 6502 has 4 registers we may want to put onto the stack: A, X, Y and the 'Flags'... unfortunately the basic 6502 can only directly do A (PHA/PLA) and the Flags (PHP/PLP) - so we will have to Transfer X/Y to A first... but the 65C02 can do it directly (PHX/PHY and PLX/PLY).
When it comes to setting the 'Stack pointer' we have to do it via the X register (with TXS) - Remember, the stack HAS to be between $0100 and $01FF on the 6502
|Let's try out the stack!
We're going to set A,X and Y to various values, and push them onto the stack,
Because we can't do this directly for X and Y, we'll have to transfer them to A first
Once we've done that, we'll show the contents of the stack...
We'll then clear all the registers - and pull them from the stack - it's important we pull them in the reverse of the order we pushed them!
Finally we'll show all the register contents
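The push and pull steps above can be sketched as follows. The register values are arbitrary, the display calls are left out, and the TXS at the start is only safe at the very beginning of a program:

```asm
        LDX #$FF
        TXS             ; stack pointer = $FF, so the first push goes to $01FF
        LDA #$11
        LDX #$22
        LDY #$33
        PHA             ; push A
        TXA
        PHA             ; push X (via A)
        TYA
        PHA             ; push Y (via A)
        LDA #0          ; clear all three registers
        TAX
        TAY
        PLA             ; pull in reverse order: Y comes back first...
        TAY
        PLA
        TAX             ; ...then X...
        PLA             ; ...and finally A (A=$11, X=$22, Y=$33 again)
```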
|We can see the
3 bytes at the top of the stack - remember the stack
pointer goes down with each push, so they are backwards
Provided we restore them in the correct order - the registers are restored - even though we cleared them before
|Subroutines are sections of code
that will be executed, and then execution will resume after they
finish
On the 6502 we call a sub with JSR (Jump SubRoutine)... and the last command of the sub is RTS (ReTurn from Subroutine)
If you're familiar with BASIC, JSR is the equivalent of GOSUB... and RTS is the equivalent of RETURN
We're going to do a test here... we'll show the stack to the screen... first we'll push the flags onto the stack,
Then we're going to use JSR to jump to subroutine StackTest... we'll show the stack again... and for reference, we'll also see the address of 'ReturnPos'
Then we'll return to the main program and show the stack again... what will happen?
|The flags are pushed onto the stack first... Next we can see the
'Return address' , that was pushed onto the stack by the JSR
Effectively JSR pushes the program counter onto the stack, and RTS pulls the Program Counter off the stack (strictly, JSR pushes the address of its own last byte, which is one less than 'ReturnPos', and RTS adds the one back on)
|Because the JSR and RTS commands use the stack to maintain the program counter, it's important that the stack is the same when a subroutine ends as it was when it started... we need to ensure we pull everything off the stack that we pushed on at the start... otherwise some 'other data' will be mistaken for the return address - and anything could happen!|
|Negative numbers in HEX are
weird!... when we subtract 1 from 0 we get 255... this means 255
IS -1... in the same way, 254 is -2 and so on - meaning a 'Signed'
byte can go from -128 to +127
The CPU doesn't 'Know ' whether it's working with signed or unsigned numbers - it all depends how we use the data...
The method for converting positive to negative is to invert all the bits, and add one... or subtract the value from zero of course!
|When we put a #-1 in the source,
it's converted to 255...
Because the numbers wrap around, adding 255 to a number decreases it by 1... so 255 IS -1
If we want to negate a number, we flip all the bits and add one... this converts $01 to $FF
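Both ways of negating can be sketched like this. The zero page address $01 used as a scratch byte is an assumption for the example:

```asm
        LDA #1
        EOR #%11111111  ; flip all the bits: $01 becomes $FE
        CLC
        ADC #1          ; add one: $FE becomes $FF, which is -1

        LDA #1
        STA $01         ; alternative: subtract the value from zero
        LDA #0
        SEC             ; set carry means 'no borrow' for SBC
        SBC $01         ; A = 0 - 1 = $FF
```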
|We learned about using Labels for Jumps, and Symbols for values
before... but symbols have another use!
We can put IFDEF statements in our code, and have parts of the assembly only compile if a symbol is defined - or not defined with IFNDEF
It's important to understand, it's not the CPU doing this, the assembler simply skips over the excluded code - so it never appears in the outputted binary!
This allows us to build multiple versions of a program from a single source, in fact it's how these tutorials support so many systems!
To disable a definition we can just rem it out with a semicolon ; - we can even define symbols on the Vasm Command line!
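A short sketch of conditional assembly, assuming VASM's oldstyle ifdef/else/endif directives. TestSymbol is the name used in the text, and the two LDA values are arbitrary:

```asm
TestSymbol equ 1        ; put a ; at the start of this line to 'rem it out'

        ifdef TestSymbol
        LDA #$11        ; assembled only when TestSymbol is defined
        else
        LDA #$22        ; assembled only when TestSymbol is NOT defined
        endif
```

Only one of the two LDA commands ends up in the binary, the other is never assembled at all.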
|The output will of course be completely different depending on whether TestSymbol is defined or not.||With TestSymbol Defined
Without TestSymbol Defined
|Subroutines are great - but there's
times they may be too slow (because of the JSR/RTS) .... and if
you want to do things with the stack, they may not be possible.
Alternatively, we can use a Macro... this is a chunk of code that we can give a simple name... then whenever we use that name - the assembler will insert the code... we can even use parameters in the macro.
Because the assembler does the work, it's faster than a call, but saves us typing all the commands... however it will make the code larger - so you will want to call to subroutines for big chunks of code where you can rather than use macros.
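As a rough sketch, a macro might be defined and used like this. It assumes VASM's oldstyle macro/endm directives, and PushAll is a made-up macro name:

```asm
        macro PushAll   ; define a chunk of code called 'PushAll'
        PHA             ; push A
        TXA
        PHA             ; push X (via A)
        TYA
        PHA             ; push Y (via A), note A now holds a copy of Y
        endm

        PushAll         ; the assembler inserts the five commands here
        PushAll         ; ...and again here, so the code gets larger each time
```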
|Unlike the Z80, we don't have pairs
of registers which we can use for 16 bit commands,
The easiest solution to this is to use consecutive bytes of the Zero Page as a pair to make up a 16 bit 'Zero Page Register'
For ease of use, we'll use Symbols to define these with a name - and we'll mimic the Z80 register pairs... for example HL is High Low... but because the 6502 is little endian, L comes first in the zero page
|When it comes to Addition or
Subtraction - we use the Carry flag...
The Carry flag stores the 'overflow' of an addition, or the 'borrow' of a subtraction.
By using two ADC we can add 16 bit (or more) numbers, and two SBC's can do a 16 bit subtract
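A minimal sketch of 16 bit addition and subtraction on zero page pairs. The symbol names z_L, z_H, z_E, z_D, their addresses and the routine names are all assumptions made for the example:

```asm
z_L     equ $20         ; low byte of the 'HL' pair
z_H     equ $21         ; high byte of the 'HL' pair
z_E     equ $22         ; low byte of the 'DE' pair
z_D     equ $23         ; high byte of the 'DE' pair

AddHL_DE:               ; HL = HL + DE
        CLC             ; no carry going into the low byte
        LDA z_L
        ADC z_E         ; add the low bytes, any overflow goes into the carry
        STA z_L
        LDA z_H
        ADC z_D         ; add the high bytes plus the carry from the low bytes
        STA z_H
        RTS

SubHL_DE:               ; HL = HL - DE
        SEC             ; set carry means 'no borrow' for SBC
        LDA z_L
        SBC z_E         ; subtract the low bytes, a borrow clears the carry
        STA z_L
        LDA z_H
        SBC z_D         ; subtract the high bytes and any borrow
        STA z_H
        RTS
```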
|When we want to use a 16 bit value,
we have to split it into its High byte, and its Low byte
Fortunately 6502 assemblers have us covered... we can use a > to calculate the high byte of a number, and < to calculate the low byte
Once we've set 16 bit pairs Z_DE and Z_HL, we can call the addition or subtraction function
Note: many of the 'Printchar' functions use the same 'Z Page' values... so we're using a special 'PrintHex' function that backs them up.
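Loading a 16 bit value into a zero page pair with < and > could look like this. The symbol names and addresses are assumptions, and $1234 is just a sample value:

```asm
z_L     equ $20         ; low byte of the pair (address is an assumption)
z_H     equ $21         ; high byte of the pair

        LDA #<$1234     ; < gives the low byte: $34
        STA z_L
        LDA #>$1234     ; > gives the high byte: $12
        STA z_H
```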
This doesn't need to stop at 16 bits, you can just keep doing ADC's to get
up to 32 bits or more...
Of course it will be slower!... another option is 'floating point'... but that's too complex to cover here!
These tutorials use Zero page registers to mimic the function of Z80
registers where the 6502 can't directly do the job... this is
because the author of these tutorials started on the Z80, and
found that the most logical way to do things...
Other Tutorials may do things differently, and if you don't like this way of using the Zero page, you should probably follow another tutorial instead.
|The Z80 and 6502 have something in
common... they have no Multiply or Divide commands... yes, you
read that right!
We can, however, simulate them!... the simplest way to multiply is to repeatedly add a value, or to repeatedly subtract one to divide...
There are faster ways of doing things - and we'll look at them later!
In our Multiply example we'll multiply A by X, and store the result in A
In our Divide example we'll Divide A by X, and store the successful divisions in X, and the remainder in A
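The two routines described above could be sketched as follows. The labels and the scratch byte z_Temp are hypothetical, the multiply result must fit in 8 bits, and the divide must not be given X = 0:

```asm
z_Temp  equ $24         ; scratch byte in the zero page (address is an assumption)

Multiply:               ; A = A * X
        STA z_Temp      ; remember the value we add each time
        LDA #0
        CPX #0
        BEQ MultDone    ; anything times zero is zero
MultLoop:
        CLC
        ADC z_Temp      ; add the value once per count in X
        DEX
        BNE MultLoop
MultDone:
        RTS

Divide:                 ; X = A / X, A = remainder
        STX z_Temp      ; the divisor
        LDX #0          ; count of successful subtractions
DivLoop:
        CMP z_Temp      ; is A still >= the divisor?
        BCC DivDone     ; no: what is left in A is the remainder
        SBC z_Temp      ; carry is already set by CMP, so no SEC is needed
        INX
        JMP DivLoop
DivDone:
        RTS
```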
|You can see we've effected a simple Multiply and Divide command!|
| Lesson 5 - Bits and Shifts
We've learned lots of maths commands, but we've still not covered the full range... this time let's take a look at how we can work with Bits on the 6502!
|Command||EOR||AND||ORA|
|Meaning||Invert the bits where the mask bits are 1||Return 1 where both bits are 1||Return 1 when either bit is 1|
|Effect||Invert Bits that are 1||Keep Bits that are 1||Set Bits that are 1|
|Let's try these commands out!
We'll use a test bit pattern, and try each command with the same %11110000 parameter,
We're using a 'MonitorBits' function, which will show the contents of the Accumulator's bits to screen!
|The bits of the test pattern will be altered in each case according to the logical command!|
|Arithmetic Shift /
|We're going to test the shifting
commands... we'll use a new testing function 'MonitorBitsC', which will
show the Accumulator and Carry flag.
We'll set the accumulator to %10111000, and we'll clear the carry flag...
Then we'll see what happens when we use each of the rotate commands 9 times!
|So what does each command do?
Well ROL rotates all the bits Left, the carry ends up in Bit 0 - and what WAS in Bit 7 ends up in the carry.
ROR is the opposite... it rotates all the bits Right, the carry ends up in Bit 7 - and what WAS in Bit 0 ends up in the carry.
ASL shifts all the bits left - but Bit 0 is always set to zero - what WAS in Bit 7 ends up in the carry, and the old carry is lost
LSR is the opposite, it shifts all the bits right - but Bit 7 is always set to zero - what WAS in Bit 0 ends up in the carry, and the old carry is lost
|The 6502 doesn't have as many bit
shift options as the Z80... but we can 'fake' others!.
If we want to shift 1's into the empty bits we can just set the carry with SEC before the rotate command,
If we want to rotate the 8 bits in the accumulator without the carry... we can back up A with PHA, do the rotate, then restore A with PLA, and do another rotate
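Both tricks can be sketched as below, with the Accumulator operand left implied (some assemblers want ROL A instead). The test value is the one used earlier in this lesson:

```asm
        LDA #%10111000
        SEC             ; set the carry so a 1 is shifted in
        ROL             ; A = %01110001, and the old bit 7 is now in the carry

        LDA #%10111000
        PHA             ; back up A
        ROL             ; first rotate: we only want its effect on the carry
        PLA             ; restore A (PLA does not change the carry)
        ROL             ; second rotate: old bit 7 arrives in bit 0
                        ; A = %01110001, an 8 bit rotate within A
```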
|Now we're able to set the new bits
to a 1, or able to rotate the bits within A
There's other ways to do this, and other combinations of commands to do things like swap nibbles... see here
There are lots of commands we'd like to have that are 'missing' on the
6502 - and this is just one possible solution
See Here for more examples of combinations of commands to effect the result you want.
|Because the BIT command needs to work with an address, we need
to define some bitmasks...
To define a byte of data in our program code we use DB - then we specify the value for the byte... we're using % and writing the values in binary
We're giving each of these a label, so we can use them easily later.
|We can use the BIT command with a label pointing to one of these
defined bytes, and then use BNE or BEQ to branch depending on if
the bit was Zero or not...
Note, the Accumulator is unchanged when we do this
|We'll branch and show a B if the bit is Zero... or an A if the
bit is One
Hint: Try changing the TBit1 to a TBit0 in the example code!
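A sketch of a BIT test against a defined bitmask. TBit0 and TBit1 follow the names in the text, while the branch labels, the test value and the use of X to hold the result are invented for the example:

```asm
        LDA #%01000010  ; the value we want to examine
        BIT TBit1       ; Z is set if (A AND %00000010) is zero, A is unchanged
        BEQ BitIsZero   ; bit 1 of A was 0
        LDX #1          ; bit 1 of A was 1
        JMP BitDone
BitIsZero:
        LDX #0
BitDone:
        RTS

TBit0:  db %00000001    ; bitmasks live in memory, out of the path of execution
TBit1:  db %00000010
```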
|Specifying Addresses in this way
will use 3 bytes per command - which is wasteful - if possible, it
would be better to store these bitmasks in the Zero page, so we
only use 2 bytes per command if we can.
When you test, two other flags are set at the same time... as well as
the Z flag being set when the tested bits are all zero, the N flag is set to bit 7
of the byte in memory, and the V flag is set to bit 6 of the byte in memory
So you can branch on conditions relating to bit 7 and 6 without any more testing commands!
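Branching on bits 7 and 6 straight after a BIT might look like this (FlagByte and the branch labels are hypothetical names):

```asm
        BIT FlagByte    ; N = bit 7 of FlagByte, V = bit 6 of FlagByte
        BMI Bit7Set     ; taken when bit 7 of FlagByte is 1
        BVS Bit6Set     ; taken when bit 6 of FlagByte is 1
        LDX #0          ; neither bit was set
        RTS
Bit7Set:
        LDX #7
        RTS
Bit6Set:
        LDX #6          ; this is the path taken with the data below
        RTS

FlagByte:
        db %01000000    ; bit 6 set, bit 7 clear
```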
|NOP (No OPeration) is a
strange command... it does absolutely nothing!
Why would we want to use it? well it's handy for a short delay - and if we do something called 'Self Modifying code' (code that rewrites itself) it can be useful for disabling commands
|The more NOPs we add, the slower the screen will fill|
|Lots of NOP
commands aren't really a good way of slowing things down - It's
far better to nest loops to slow things down or use some kind of
NOP's are more useful for self modifying code - we'll learn about that next time!
| Lesson 6 - Defined data, Aligned data... Lookup
Tables, Vector Tables, and Self-modifying code!
Now we've learned all the basic maths commands, it's time to start looking at some clever tricks!
|There will be times we need to define data for use within our
code areas... we can use three commands to do this...
DB will define one or more bytes
DW will define one or more words (in little endian)
DS will define sequences of defined length in bytes - if only one parameter is specified, then all the bytes are zero, if two are specified they will all be the specified value
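The three commands could be used like this, assuming VASM syntax (the labels and values are made up for illustration):

```asm
ByteData:
        db $01,$02,$03  ; three bytes: 01 02 03
WordData:
        dw $1234,$5678  ; two words, low byte first: 34 12 78 56
ZeroFill:
        ds 4            ; four bytes, all zero
ValueFill:
        ds 4,$FF        ; four bytes, all $FF
```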
|The contents of the defined bytes will be shown... notice that the bytes with DW are backwards, because the 6502 is little endian|
DB, DW and DS are assembler commands not 6502 opcodes... they will work
in VASM and other assemblers, but depending on your assembler
the commands may be different.
Check your documentation if the commands do not work as you expect!
|A Lookup table is just a set of data for some purpose, we can
lookup a numbered entry and use the result for some purpose...
For Example, if we want to draw a sine wave, but don't want to try to calculate a sine wave, we can just read the needed values from a 'Lookup Table'
|We're going to use this lookup
table to set an X position, and repeatedly decrement the Y - so we
can draw a sinewave in X'es
The 6502's Indexed addressing mode is perfect for this kind of work!
We LDA sine,X to read in entry X from the sine lookup table!
Note... the Lookuptable has values 0-255 - we need to scale it down by dividing it by 16 - we do that with 4 LSR's
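Reading and scaling one entry can be sketched as below. SineTable here is a short, approximate 16 step table made up for the example, not the table used in the lesson:

```asm
        LDX #3          ; the entry number we want
        LDA SineTable,X ; A = entry 3 of the table (245)
        LSR
        LSR
        LSR
        LSR             ; four shifts right = divide by 16, so A = 15
        RTS

SineTable:              ; one full wave, values 0-255 (approximate)
        db 128,177,218,245,255,245,218,177
        db 128,79,38,11,1,11,38,79
```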
|Our sine wave will be shown to screen... it's not very high resolution, but we could add extra steps if we wanted.|
|The entries in a lookup
table don't have to just be 1 byte, they can be as many bytes as you
want - though if you use X to read in the entries... your
lookup table can be at most 256 bytes in total, so if each entry is 4
bytes (2 words), then the Lookuptable can only have 64 entries!
You can always calculate the address to read from manually rather than using X if you need more
|One special kind of Lookup
Table is sometimes called a 'Vector table'...
This is a table of 16 bit words... each of which is an address... we use our lookup table code to read in an address - then execute the code at that address!
Effectively, this allows us to execute commands based on single byte 'command numbers'... this can save memory if we need
In this example, we'll define 4 silly commands to try out - they'll just show simple text to screen
|We need to define a function to execute a numbered command from
this list .
We'll take a number in via the Accumulator - double it with ASL, and load a pair of bytes from that offset in the Vector Table...
The address we got will be where we want to go, so we'll use it with an indirect jump via JMP (Z_HL)
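A minimal sketch of such a function. The zero page address of Z_HL, the table and the two tiny commands are assumptions for the example, and it expects to be called with JSR VectorJump so that each command's RTS returns to the caller:

```asm
Z_HL    equ $20         ; zero page pair: low byte at $20, high byte at $21

VectorJump:             ; A = command number
        ASL             ; double A, because each table entry is 2 bytes
        TAX
        LDA VectorTable,X
        STA Z_HL        ; low byte of the address
        LDA VectorTable+1,X
        STA Z_HL+1      ; high byte of the address
        JMP (Z_HL)      ; indirect jump to the address we just loaded

VectorTable:
        dw Command0
        dw Command1

Command0:
        LDA #0
        RTS
Command1:
        LDA #1
        RTS
```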
|We can call our 'VectorJump'
command just by passing a value in A,
But if we want to be really powerful, we can process a 'CommandList'... with a set of numbered commands!
|We'll need to define this command
list, and also a few strings...
If we want, we can use Symbols defined with EQU to give 'names' to these numbered commands!
|The result of the calls at the start, and the command list are shown here... you can try changing the command numbers and see the results|
Vector tables have AWESOME POWER! They allow us to turn a number into an
executed command - in this case we've effectively created a
scripting language!... because each command is just one byte...
we could have hundreds of calls and save lots of space compared
to sets of JSR's!
|Self Modifying code is where our program overwrites parts of
itself... why would we want to do this? well rather than a
condition and a branch, there may be times where we can just
reprogram a jump - and rather than loading A from a memory
address, we could just reprogram a LDA command...
The reasons we may want to do this are twofold - gaining speed, and saving bytes (though saving bytes will also usually gain speed!)
This routine has two pieces of self modifying code... rather than PHA/PLA and TXA/TAX - we'll use self modifying code to restore X by replacing the byte at the end of LDX with the correct value
Also we'll self modify the last byte of a Jump to cause the Vector jump - this is much simpler than the indirect jump we used before, but relies on the addresses of all the commands having the same top byte
How can we make sure all the commands have the same top byte? well we need to pad our code with zero bytes until a new page starts (for example $1200 or $1300)
With VASM - the Align command takes a parameter which is the number of low address bits that must be zero - for example ALIGN 2 will align to a 4 byte (32 bit) boundary - and ALIGN 8 will do what we need - and align to a 256 byte page boundary - note, this command will be different on other assemblers.
|Self Modifying code allows for extra speed and saves memory - but it's complex and only works from RAM - so if your program is running in ROM it won't work.|
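The 'restore X' trick can be sketched like this. The label names are made up, and the code must be running from RAM for the write to work:

```asm
SaveX:
        STX RestoreX+1  ; write X into the operand byte of the LDX below
        LDX #0          ; X is now free to use for other work
RestoreX:
        LDX #$00        ; the $00 is overwritten at run time with the saved X
        RTS
```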
|We can use vector tables to create 'modules' of code and execute them with a single call - with a 'parameter' which defines the command number - The calling code doesn't need to know the internals, so long as each numbered command does the same job it will work fine... this allows you to have different loadable modules, and the internals can change so long as the base call and functions of each numbered command does not.|