+
+Using function prototypes will not only help to find errors between separate
+modules, it will also generate better code, since the compiler need not assume
+that a variable sized parameter list is in place and need not pass the
+argument count to the called function. This leads to shorter and faster code.
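
A minimal sketch of the point above. The function name clamp_add is hypothetical and chosen only for illustration. With the prototype visible, the compiler knows the exact parameter list at every call site:

```c
/* Prototype, as it would appear in a header shared by several modules.
   clamp_add is a hypothetical name used only for this illustration. */
unsigned char clamp_add(unsigned char a, unsigned char b);

/* Definition: adds two bytes and saturates the result at 255. */
unsigned char clamp_add(unsigned char a, unsigned char b)
{
    unsigned sum = (unsigned)a + b;
    return (unsigned char)((sum > 255u) ? 255u : sum);
}
```

A call such as clamp_add(200, 100) is then checked against the declared parameter types.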
+
+
+
+
+
+Variable declarations in nested blocks are usually a good thing. But with
+cc65, there is a drawback: Since the compiler generates code in one pass, it
+must create the variables on the stack each time the block is entered and
+destroy them when the block is left. This causes a speed penalty and larger
+code.
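
The following sketch contrasts the two styles. Both functions are hypothetical and compute the same sum. Going by the description above, the second variant should avoid the repeated stack setup because all locals are declared at function level:

```c
/* Declares tmp inside the loop body, i.e. in a nested block. */
unsigned sum_nested(const unsigned char* data, unsigned char n)
{
    unsigned total = 0;
    unsigned char i;
    for (i = 0; i < n; ++i) {
        unsigned char tmp = data[i];
        total += tmp;
    }
    return total;
}

/* Declares every local at the top of the function instead. */
unsigned sum_hoisted(const unsigned char* data, unsigned char n)
{
    unsigned total = 0;
    unsigned char i;
    unsigned char tmp;
    for (i = 0; i < n; ++i) {
        tmp = data[i];
        total += tmp;
    }
    return total;
}
```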
+
+
+
+
+
+The compiler needs hints from you about the code to generate. When accessing
+indexed data structures, get a pointer to the element and use this pointer
+instead of calculating the index again and again. If you want to have your
+loops unrolled, or loop invariant code moved outside the loop, you have to do
+that yourself.
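
As an illustration of the pointer hint, here is a small sketch. The record type, the table and both function names are assumptions made up for this example:

```c
/* Hypothetical record type and table, used only for illustration. */
struct entry {
    unsigned char a;
    unsigned char b;
    unsigned char c;
    unsigned char d;
};

static const struct entry table[3] = {
    { 1, 2, 3, 4 }, { 10, 20, 30, 40 }, { 5, 5, 5, 5 }
};

/* Indexes the table again for every field access. */
unsigned sum_indexed(unsigned char i)
{
    return table[i].a + table[i].b + table[i].c + table[i].d;
}

/* Takes the element address once and reuses the pointer. */
unsigned sum_via_pointer(unsigned char i)
{
    const struct entry* p = &table[i];
    return p->a + p->b + p->c + p->d;
}
```

Both functions return the same value. The second one spells out the address calculation a single time instead of leaving it to the compiler to notice the repetition.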
+
+
+
+
+
+While long support is necessary for some things, it's really, really slow on
+the 6502. Remember that any long variable will use 4 bytes of memory, and any
+operation works on double the data compared to an int.
+
+
+
+
+
+The CPU has no opcodes to handle signed values wider than 8 bits. So sign
+extension, tests of signedness etc. have to be done by hand. The code to handle
+signed operations is usually a bit slower than the same code for unsigned
+types.
+
+
+
+
+
+While in arithmetic operations, chars are immediately promoted to ints, they
+are passed as chars in parameter lists and are accessed as chars in variables.
+The code generated is usually not much smaller, but it is faster, since
+accessing chars is faster. For several operations, the generated code may be
+better if intermediate results that are known not to be larger than 8 bits are
+cast to chars.
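
A minimal sketch of casting an intermediate result, assuming a hypothetical helper named low_nibble_clear. The masked value can never exceed 8 bits, and the cast says so explicitly:

```c
/* Returns 1 if the low four bits of a are all zero, 0 otherwise. */
unsigned char low_nibble_clear(unsigned char a)
{
    /* The cast states that the masked value fits into 8 bits. */
    return (unsigned char)(a & 0x0F) == 0;
}
```

Whether this actually yields better code depends on the compiler version and the optimizer settings, so it is worth checking the generated assembly.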
+
+When doing
+
+
+
+When indexing into an array, the compiler has to calculate the byte offset
+into the array, which is the index multiplied by the size of one element. When
+doing the multiplication, the compiler will do a strength reduction, that is,
+replace the multiplication by a shift if possible. For the element sizes 2, 4
+and 8, there are even more specialized subroutines available. So, array access
+is fastest when the element size is one of these values.
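
One way to apply this is to pad a structure to a power of two. This is only a sketch with made-up type names, and it trades one byte of memory per element for the cheaper offset calculation:

```c
/* Three bytes per element: the offset is index * 3. */
struct point3 { unsigned char x, y, z; };

/* Padded to four bytes per element: the offset is index * 4. */
struct point4 { unsigned char x, y, z, pad; };

/* Hypothetical table using the padded layout. */
static const struct point4 path[3] = {
    { 1, 2, 3, 0 }, { 4, 5, 6, 0 }, { 7, 8, 9, 0 }
};

/* Reads one field; the index is scaled by sizeof(struct point4). */
unsigned char path_y(unsigned char i)
{
    return path[i].y;
}
```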
+
+
+
+
+
+Since cc65 is not building an explicit expression tree when parsing an
+expression, constant subexpressions may not be detected and optimized properly
+if you don't help. Look at this example:
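
A minimal sketch of such an expression, assuming a hypothetical constant OFFS. C parses the first form as (i + OFFS) + 3, so the two constants are never adjacent. The second form groups them so they can be folded into a single value:

```c
#define OFFS 4   /* hypothetical constant, for illustration only */

/* Parsed left to right as (i + OFFS) + 3: two separate additions. */
int add_left_to_right(int i)
{
    return i + OFFS + 3;
}

/* The parentheses make OFFS + 3 a constant subexpression of its own. */
int add_grouped(int i)
{
    return i + (OFFS + 3);
}
```

Both functions return the same result. Only the amount of work left to the compiler differs.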
+
+
+
+Labels that appear first in a switch statement are tested first. So, if your
+switch statement contains labels that are selected most of the time, put them
+first in your source code. This will speed up the code.
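
For illustration, a hypothetical classifier that assumes a space is by far the most frequent input and therefore lists that label first:

```c
/* Returns a small class code for a character.
   Assumption: ' ' is the most common input, so its label comes first. */
unsigned char classify(char c)
{
    switch (c) {
        case ' ':  return 0;   /* most frequent case */
        case '\n': return 1;
        case '\t': return 2;
        default:   return 3;
    }
}
```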
+
+
+
+
+
+The compiler is not always smart enough to figure out whether the value of an
+increment expression is used or not. So it has to save and restore that value
+when producing code for the postincrement and postdecrement operators, even if
+this value is never used. To avoid the additional overhead, use the
+preincrement and predecrement operators if you don't need the resulting value.
+That means, write ++i rather than i++ where the result is discarded.
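
A short sketch with a hypothetical function. Neither increment needs its old value, so both use the prefix form:

```c
/* Counts the zero bytes in the first n bytes of data. */
unsigned char count_zeros(const unsigned char* data, unsigned char n)
{
    unsigned char count = 0;
    unsigned char i;
    for (i = 0; i < n; ++i) {      /* result of ++i is not used */
        if (data[i] == 0) {
            ++count;               /* result of ++count is not used */
        }
    }
    return count;
}
```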
+
+
+
+The compiler produces optimized code if the value of a pointer is a constant.
+So, to access direct memory locations, use a constant address cast to a
+pointer type instead of a pointer variable.
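
A sketch of both variants. The macro name HW_REG and the address 0xD020 are assumptions chosen for illustration, and the right address depends entirely on the target machine:

```c
/* Hypothetical memory-mapped register; 0xD020 is only an example address. */
#define HW_REG (*(volatile unsigned char*)0xD020)

/* The pointer value is a compile-time constant. */
void set_reg_constant(unsigned char value)
{
    HW_REG = value;
}

/* The same store through a pointer variable, for comparison. */
void set_reg_variable(volatile unsigned char* reg, unsigned char value)
{
    *reg = value;
}
```

Calling set_reg_constant is only meaningful on the target hardware. On a hosted system the store would go to an arbitrary address.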
+
+
+
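+Initialization of local variables when declaring them gives shorter and faster
+code. So, give a local variable its initial value in the declaration instead
+of assigning it in a separate statement.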
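
A minimal sketch of the two forms, using hypothetical function names. The results are identical. Only the way the local gets its first value differs:

```c
/* Declares the local first and assigns to it afterwards. */
int scale_separate(int x)
{
    int factor;
    factor = 3;
    return x * factor;
}

/* Initializes the local as part of its declaration. */
int scale_initialized(int x)
{
    int factor = 3;
    return x * factor;
}
```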
+
+
+
+When addressing an array via a pointer, don't use the plus and dereference
+operators, but the array operator. This will generate better code in some
+common cases.
+
+That means, write a[i] instead of *(a + i).
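
Both forms side by side, as a sketch with hypothetical function names. They are equivalent in C, so switching from one to the other never changes behavior:

```c
/* Uses the plus and dereference operators. */
char get_deref(const char* a, unsigned char c)
{
    return *(a + c);
}

/* Uses the array operator, as recommended above. */
char get_index(const char* a, unsigned char c)
{
    return a[c];
}
```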
+
+
+
+Register variables may give faster and shorter code, but they do also have an
+overhead. Register variables are actually zero page locations, so using them
+saves roughly one cycle per access. Since the old values have to be saved and
+restored, there is an overhead of about 70 cycles per 2 byte variable. Apart
+from the additional code that is needed to save and restore the values, it is
+easy to see that you need to make heavy use of a variable to justify the
+overhead.
+
+Pointers are an exception, especially char pointers. The optimizer has code to
+detect and transform the most common pointer operations if the pointer
+variable is a register variable. Declaring heavily used character pointers as
+register may give significant gains in speed and size.
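
A sketch of the char pointer case with a hypothetical function. The pointer is dereferenced and incremented in a tight loop, which is the kind of heavy use that may justify the save and restore overhead:

```c
/* Returns the length of a zero-terminated string. */
unsigned str_length(const char* s)
{
    register const char* p = s;   /* heavily used char pointer */
    while (*p) {
        ++p;
    }
    return (unsigned)(p - s);
}
```

If register variables are not enabled, the keyword has no effect on the generated code and the function still works the same way.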
+
+And remember: register variables must be enabled explicitly when invoking the
+compiler.
+
+
+
+Decimal constants greater than 0x7FFF are actually long ints
+
+The language rules for constant numeric values specify that decimal constants
+without a type suffix that do not fit into an int must be of type long int
+or unsigned long int. This means that a simple constant like 40000 is of type
+long int, and may cause an expression to be evaluated with 32 bits.
+
+An example is:
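
A minimal sketch, assuming a 16 bit int as on cc65 and using hypothetical function names. In the first function the constant 40000 has type long int, so the comparison is carried out with 32 bits. The U suffix in the second function keeps it within unsigned int:

```c
/* With a 16 bit int, 40000 is a long int: 32 bit comparison. */
unsigned char below_limit_long(unsigned val)
{
    return val < 40000;
}

/* With a 16 bit int, 40000U is an unsigned int: 16 bit comparison. */
unsigned char below_limit_unsigned(unsigned val)
{
    return val < 40000U;
}
```

On a machine with a 32 bit int both constants fit into an int or unsigned int, so the difference only shows on 16 bit targets.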
+
+