Hi everyone. I’ve posted the source code for the demo shown in the post from a few days ago.
So I decided to hang the camera above the work table and record the whole process of going from a bare board to a populated and working PCB. I apologize in advance for the mumbling.
As you know, you can always start a good flame war by mentioning performance comparisons, and I’m sure this post will qualify for that treatment. The comparison is between the UTFT library, a fantastic cross-platform AVR/PIC TFT library that supports a whole list of TFT driver chips, and a version of UTFT with almost all controller conditionals removed and some choice routines replaced by hand-crafted assembler. The performance difference is staggering.
As you can see the optimized library is 15 times as fast. Now how did we get there?
Replace _fast_fill_16 with something more appropriate to the name. Looking at the disassembly for this piece of code, I was horrified to find oodles of code for such a simple thing. All this piece needs to do is toggle the WR line LOW and then HIGH once for each pixel. The controller automatically advances to the next write address. On AVR, a write to a port takes one clock cycle, so that basic element takes only two clock cycles.
.macro TOGGLE_WR_FAST value1, value2
    out _SFR_IO_ADDR(WR_PORT), \value1
    out _SFR_IO_ADDR(WR_PORT), \value2
.endm
Load two registers of your choice with the values to write to the port and invoke this macro. What I liked about the original fast_fill_16 is that it unrolled the big loop into two loops: one does 16 pixels at a time and the second finishes whatever is left over. This avoids a lot of branching, so I stuck with that in assembler. Assume that the number of 16-pixel blocks is in r24:r25 and the number of remaining single pixels is in r18:
    sbiw r24,0              // subtract zero and test if zero
    breq exitloop16
loop16:
    TOGGLE_WR_FAST r31,r30
    TOGGLE_WR_FAST r31,r30
    TOGGLE_WR_FAST r31,r30
    TOGGLE_WR_FAST r31,r30
    TOGGLE_WR_FAST r31,r30
    TOGGLE_WR_FAST r31,r30
    TOGGLE_WR_FAST r31,r30
    TOGGLE_WR_FAST r31,r30
    TOGGLE_WR_FAST r31,r30
    TOGGLE_WR_FAST r31,r30
    TOGGLE_WR_FAST r31,r30
    TOGGLE_WR_FAST r31,r30
    TOGGLE_WR_FAST r31,r30
    TOGGLE_WR_FAST r31,r30
    TOGGLE_WR_FAST r31,r30
    TOGGLE_WR_FAST r31,r30
    sbiw r24,1
    brne loop16
exitloop16:
    cpi r18,0
    breq exitsingleloop
singleloop:
    TOGGLE_WR_FAST r31,r30
    dec r18
    brne singleloop
exitsingleloop:
    ret
This alone is a tremendous speedup and takes care of fillRect, clrScr, and horizontal and vertical lines.
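For readers who don't read AVR assembler, here is a rough C sketch of the same two-loop structure. It is an illustration only, not the library code: toggle_wr() and wr_pulses are hypothetical stand-ins that count strobes instead of touching a real port, so the sketch can run on a PC.

```c
#include <stdint.h>

/* Hypothetical stand-in for the WR strobe. On the AVR this would be the
   two 'out' instructions from TOGGLE_WR_FAST; here it only counts pulses. */
static uint32_t wr_pulses = 0;

static inline void toggle_wr(void)
{
    wr_pulses++;
}

/* Strobe WR once per pixel: full blocks of 16 first, then the remainder.
   Assumes pixels / 16 fits in 16 bits, like the r24:r25 pair above. */
void fast_fill_sketch(uint32_t pixels)
{
    uint16_t blocks = (uint16_t)(pixels / 16); /* r24:r25 in the assembler */
    uint8_t  rest   = (uint8_t)(pixels % 16);  /* r18 in the assembler */

    while (blocks--) {
        /* unrolled: one loop test per 16 pixels */
        toggle_wr(); toggle_wr(); toggle_wr(); toggle_wr();
        toggle_wr(); toggle_wr(); toggle_wr(); toggle_wr();
        toggle_wr(); toggle_wr(); toggle_wr(); toggle_wr();
        toggle_wr(); toggle_wr(); toggle_wr(); toggle_wr();
    }
    while (rest--) {
        toggle_wr();
    }
}
```

Calling fast_fill_sketch(320UL * 240UL) strobes once for every pixel of a full screen while testing the loop condition only 4800 times.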
Arbitrary lines were very slow in C code. So I rewrote the Bresenham algorithm in assembler as well. It’s a bit more code so I won’t bother copying it here. I will have the source code available soon though.
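As a reminder of what the algorithm does, here is a minimal C sketch of the textbook integer Bresenham line. It is not the assembler version mentioned above; line_sketch, plot_fn, and the trace helper are names made up for this illustration.

```c
#include <stdlib.h>

/* Callback that receives every pixel on the line. */
typedef void (*plot_fn)(int x, int y, void *ctx);

/* Textbook all-octant integer Bresenham: only adds, compares, and a
   doubling, which is why it maps well onto a small 8-bit CPU. */
void line_sketch(int x0, int y0, int x1, int y1, plot_fn plot, void *ctx)
{
    int dx = abs(x1 - x0), sx = (x0 < x1) ? 1 : -1;
    int dy = -abs(y1 - y0), sy = (y0 < y1) ? 1 : -1;
    int err = dx + dy;

    for (;;) {
        plot(x0, y0, ctx);
        if (x0 == x1 && y0 == y1) {
            break;
        }
        int e2 = 2 * err;
        if (e2 >= dy) { err += dy; x0 += sx; } /* step along x */
        if (e2 <= dx) { err += dx; y0 += sy; } /* step along y */
    }
}

/* Example callback: counts pixels and remembers the last one plotted. */
struct trace { int count; int last_x; int last_y; };

void trace_plot(int x, int y, void *ctx)
{
    struct trace *t = (struct trace *)ctx;
    t->count++;
    t->last_x = x;
    t->last_y = y;
}
```

On the display, the callback would set the write address and strobe one pixel; here it only records what was plotted.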
Bitmaps could also use the assembler rewrite and benefited tremendously. However, when working with the Hack a Day logo I noticed that it has a lot of repeated pixels. Getting the bitmap data from flash memory is slow: 6 clock ticks per pixel. What if I could do some rudimentary compression? RLE seemed to fit the bill, so I went with something similar to PackBits compression. This reduced the storage for a 16-bit 83×76 pixel bitmap from about 12kB to only 2944 bytes. Lossless compression! It sped up bitmap drawing considerably and cut flash usage by roughly three quarters.
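To make the encoding concrete, here is a small C sketch of an encoder for such a stream. It is not the actual converter. The layout is an assumption read off the display routine in this post: a header byte with bit 7 set means "repeat the next pixel (low 7 bits) times", a header with bit 7 clear means "that many literal pixels follow", pixels are stored high byte first, and a zero header ends the stream. The name pb565_encode is made up for this example.

```c
#include <stddef.h>
#include <stdint.h>

/* Encode n RGB565 pixels into 'out' and return the number of bytes written.
   'out' must have room for 3 * n + 1 bytes (worst case). */
size_t pb565_encode(const uint16_t *px, size_t n, uint8_t *out)
{
    size_t i = 0, o = 0;

    while (i < n) {
        /* length of the run of identical pixels starting at i (max 127) */
        size_t run = 1;
        while (i + run < n && run < 127 && px[i + run] == px[i]) {
            run++;
        }

        if (run >= 2) {
            out[o++] = (uint8_t)(0x80 | run);      /* run header */
            out[o++] = (uint8_t)(px[i] >> 8);      /* pixel, high byte */
            out[o++] = (uint8_t)(px[i] & 0xFF);    /* pixel, low byte */
            i += run;
        } else {
            /* gather literals until the next run begins (max 127) */
            size_t lit = 1;
            while (i + lit < n && lit < 127 &&
                   !(i + lit + 1 < n && px[i + lit] == px[i + lit + 1])) {
                lit++;
            }
            out[o++] = (uint8_t)lit;               /* literal header */
            for (size_t k = 0; k < lit; k++) {
                out[o++] = (uint8_t)(px[i + k] >> 8);
                out[o++] = (uint8_t)(px[i + k] & 0xFF);
            }
            i += lit;
        }
    }
    out[o++] = 0;                                  /* end-of-stream marker */
    return o;
}
```

A run of up to 127 identical pixels costs three bytes instead of two bytes per pixel, which is where the savings on a logo with large flat areas come from.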
The source code for the C image converter can be found here. The assembler to actually display this bitmap is very very simple and very fast:
.global fastbitmap_pb565
fastbitmap_pb565:
    /* r24:r25 data */
    /* this block sets up the TOGGLE_WR_FAST registers r26:r27
       (WR high in r26, WR low in r27) */
    in   r26, _SFR_IO_ADDR(WR_PORT)
    mov  r27, r26
    set
    bld  r26, WR_PIN
    clt
    bld  r27, WR_PIN
    movw r30, r24
    clr  r1
PB565BIT_LOOP:
    LPM  r18, Z+
    cpi  r18, 0
    breq PB565BIT_DONE
    bst  r18, 7
    brtc PB565PLAIN
    // compressed loop.
    andi r18, 0x7F
    LPM  r0, Z+
    out  DPHIO, r0
    LPM  r0, Z+
    out  DPLIO, r0
PB565COMPRESSED:
    TOGGLE_WR_FAST r27, r26
    dec  r18
    brne PB565COMPRESSED
    rjmp PB565BIT_LOOP
PB565PLAIN:
    LPM  r0, Z+
    out  DPHIO, r0
    LPM  r0, Z+
    out  DPLIO, r0
    TOGGLE_WR_FAST r27, r26
    dec  r18
    brne PB565PLAIN
    rjmp PB565BIT_LOOP
PB565BIT_DONE:
    clr  r0
    ret
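For checking converter output on a PC, the same logic can be mirrored in C. This is only a sketch of the routine above, writing into a pixel buffer instead of the display bus; pb565_decode and its parameters are hypothetical names, not part of the posted source.

```c
#include <stddef.h>
#include <stdint.h>

/* Decode a stream in the format read by fastbitmap_pb565 and return the
   number of pixels it describes. At most max_px pixels are stored in 'out'.
   Unlike the assembler loop, a header with a zero count emits nothing. */
size_t pb565_decode(const uint8_t *in, uint16_t *out, size_t max_px)
{
    size_t n = 0;

    for (;;) {
        uint8_t hdr = *in++;
        if (hdr == 0) {
            break;                          /* end-of-stream marker */
        }
        uint8_t count = hdr & 0x7F;

        if (hdr & 0x80) {
            /* run: one pixel (high byte, low byte) repeated 'count' times */
            uint16_t p = (uint16_t)((in[0] << 8) | in[1]);
            in += 2;
            while (count--) {
                if (n < max_px) { out[n] = p; }
                n++;
            }
        } else {
            /* literal: 'count' pixels follow, two bytes each */
            while (count--) {
                uint16_t p = (uint16_t)((in[0] << 8) | in[1]);
                in += 2;
                if (n < max_px) { out[n] = p; }
                n++;
            }
        }
    }
    return n;
}
```

Comparing the decoded buffer with the original image is a quick way to confirm that a compressed bitmap really is lossless before flashing it.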
Those three changes really took care of the low-hanging fruit. UTFT is written as a C++ class, as it is an Arduino library. I wonder how much time the compiled code spends keeping track of the ‘this’ pointer and dereferencing variables.
I’m very pleased with the speedup so far. The boards have come in. Next blog post will detail the assembly of the board using mostly SMT components.
Update: Full source code is posted here
It’s been really quiet for a long time. We’ve moved states, and for about a year I didn’t have my ‘garage’ to work in. For the past two months, though, I’ve had a good spot again to work on electronics. My first project for the blog is a new reflow controller using a TFT display.
I settled on an eBay bought HY32D 320×240 16 bit color TFT display with touch controller. This is a pretty neat and cheap thing. It’s based on the SSD1289 controller and is wired with a 16 bit parallel bus.
A quick back-of-the-envelope calculation told me that a 10MHz ATMega controller should manage about a 10Hz refresh rate for a single-color whole-screen redraw, assuming I could write a pixel in at most 10 clock cycles. Not bad at all. Previously I had used a similar display on an NXP LPC1768 development board clocked at 100MHz, but it was connected serially. Yikes, that was painful to watch: you could see the screen being drawn. Not a good UI experience.
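The estimate is easy to reproduce. The helper below is just a sketch of that arithmetic; the function name is made up, and the inputs are the 320×240 panel, the 10MHz clock, and the assumed 10 cycles per pixel.

```c
#include <stdint.h>

/* Whole-screen fills per second for a given CPU clock and per-pixel cost.
   Uses integer division, so the result is rounded down. */
uint32_t full_screen_fills_per_second(uint32_t cpu_hz,
                                      uint32_t width,
                                      uint32_t height,
                                      uint32_t cycles_per_pixel)
{
    uint32_t cycles_per_frame = width * height * cycles_per_pixel;
    return cpu_hz / cycles_per_frame;
}
```

With 10000000, 320, 240, and 10 as arguments this returns 13, the same ballpark as the 10Hz figure. At 2 cycles per pixel, the cost of a bare WR strobe, the same formula gives 65.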
I had designed an ATMega644pa board for this display and sent it out to OSHPark. While I was waiting for the boards, the TFT displays arrived first! So now I had the displays but nothing to drive them with yet. This irked me to no end.
Luckily I had an old, old prototype board with a DIP ATMega644pa (3.3V @ 10MHz) sitting around from when I developed an Ethernet-connected gizmo for my company. Quickly cutting the power trace to the Ethernet chip gave me the perfect experimentation board. About 30 minutes later I had the display hooked up with some ugly, ugly ribbon cables.
A search on the internet led me to UTFT, a fantastic cross platform TFT driver library with support for AVR and PIC and a whole range of TFT controllers, with documentation even! Very nicely done. After removing the Arduino calls from the library it was ready to be programmed onto my contraption.
Lo and behold, it actually worked! However, the refresh was abysmal. Even with a 16-bit bus the display was drawn very, very slowly.
Time to look at the code in more detail… Next installment: AVR assembler and the huge speed gains to be had.
I’ve been a bit disappointed by the 445 nm diode so far. It’s not really burning as well as the 405nm diode. So I decided to put down some estimates to see what is happening.
I was unable to focus the 445nm to a pinpoint as small as the 405nm. If I had to estimate it would be 0.4 – 0.6mm ‘dot’ for the 445nm vs a 0.1 – 0.2mm pin for the 405nm. Let’s run with that:
So unless I can get the dot size down with the better lens the 445nm will have a hard time competing with the 405nm. This is consistent with my observations on the CNC so far.
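A rough way to put numbers on this is power per unit of dot area. The sketch below assumes a round dot and takes the midpoints of the dot sizes above (0.5mm and 0.15mm). The power figures are the 400mA readings quoted elsewhere on this blog (about 250mW for the 445nm diode and about 400mW for the 405nm diode), so treat the result as an order-of-magnitude estimate. The function name is made up.

```c
/* Average power density in mW per square millimetre for a round dot. */
double power_density_mw_per_mm2(double power_mw, double dot_diameter_mm)
{
    const double pi = 3.14159265358979323846;
    double radius_mm = dot_diameter_mm / 2.0;
    return power_mw / (pi * radius_mm * radius_mm);
}
```

With those inputs the 445nm dot comes out near 1.3W per square millimetre and the 405nm dot above 20W per square millimetre. Even at equal power, the smaller dot alone is worth roughly a factor of eleven, because density scales with the square of the diameter ratio.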
The 445nm diode came in and I’ve hooked it up to the CNC. This diode is a multi-mode laser diode. This means that the dot will not be nice and round as with the 405nm diode, but it has some real power and durability. People seem to run these at about 1amp, where the diode delivers about 1Watt of laser power. That is not bad!
As a slow start, I raised the platform as close to the laser as I could while still focusing the beam (to keep the dot as small as possible) and set the current to 400mA, figuring I would get about 400mW of power. However, at 400mA this diode barely burns paper! The 405nm diode at 400mA rips through paper like butter. Needless to say, a bit disappointing.
A quick look at the current vs. power graph over at Laser Pointer Forums confirms that this diode takes a good amount of current just to get started: at 400mA we get about 250mW of laser power, whereas the 405nm diode gave us 400-500mW.
Sadly I can’t drive the power up too much further today as I am running out of juice on the power supply. That’ll be the next improvement. So this diode should really shine with a bit more amperage.
I ordered a better lens from JayRob on the forums ($13) to see if I can get this laser focused nicely.
To run a CNC laser you’ll need a laser diode driver (current source) that can be modulated. I wanted a driver that can also be modulated with an analog voltage, in case I want to give EMC control over the laser power output. After some searching I settled on a modified StanHam. I changed the design by adding soft-start capacitors (not tested yet), converting it to all surface-mount components, and fitting a beefier output transistor, as I intend to run this at up to one amp for my 445nm diode. The board layout is as follows:
and once populated it looks like this:
Pardon the messy tin; I’m trying to protect the copper layer from oxidizing away. Be careful when running this for the first time: most likely you’ll short-circuit the power supply 🙂 The driver is very sensitive to the position of the trim pots RV1 and RV2. Start with RV2 set so that R6 is connected to ground and RV1 set so that R3 is connected to ground. Then apply power to P3 (0-5V) and start increasing RV1. You can estimate the current that will flow even before you attach your dummy laser: measure the voltage over C4, multiply it by two, and assume 1V = 1A. So if you read 125mV, your driver would output 250mA. The voltage at pin 1 of the IC should be twice the voltage at pin 3 and equal to the voltage over R10 when a diode or dummy load is present.
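The rule of thumb is simple enough to keep in a helper while tuning the pots. This is only a sketch of the arithmetic described above (twice the C4 voltage, 1V = 1A); the function names are made up and nothing here talks to the hardware.

```c
#include <stdint.h>

/* Estimated driver output in mA from the voltage measured over C4 in mV:
   double the reading and treat 1 mV as 1 mA. */
uint32_t estimated_output_ma(uint32_t c4_mv)
{
    return 2u * c4_mv;
}

/* Inverse: the C4 reading in mV to aim for to get a target current in mA. */
uint32_t c4_target_mv(uint32_t target_ma)
{
    return target_ma / 2u;
}
```

For example, a 125mV reading maps to 250mA, and a 1A target corresponds to about 500mV over C4.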
1 x Transistor 2SD1758TLR
3 x Diodes 1N4148WS
1 x Dual opamp LM358MX
2 x 10k Trim pots PVG5A103C03R00
1 x SMD 1210 LED LTST-C930KGKT
1 x 1 Watt 2512 SMD 1 Ohm resistor
1 x 100uF capacitor
1 x 100nF SMD805 ceramic capacitor
7 x 10k SMD805 resistors
2 x 1k SMD805 resistors
2 x 470pF 805 SMD ceramic capacitors (optional)
How nice to have a quiet Sunday working on your hobby in the garage. I spent the day preparing the CNC for the new laser diode that is coming this week. I’m getting a 1 Watt 445nm laser diode! For this purpose I thought it was time to close up the CNC so no light shoots around and gets me or any of the kids 😉 and yes, I do have the proper goggles as well, OD7+ for this wavelength. Don’t even think of running that diode at its full power without them. Be smart.
I’ve engraved the enclosure with the laser itself. The header of the blog is actually a shot of the case. Not bad looking 🙂 I created that G-Code with truetype-tracer. This delivers some smooth-looking fonts and allows for ‘filling’. Note that the filling requires you to turn off block-delete in EMC, otherwise it will not fill. I wasted a good amount of time trying to figure that out.
I’ve been trying to get the laser to cut stencils in some overhead sheets that I bought, but so far I’ve not been happy with the results. The TQFP32 footprint is tiny and the plastic melts too much. I still have a lot of parameters to tweak so it’s a work in progress.
I seem to have broken the home switch on the y-axis. I’ll have to look at the printer carcasses I have sitting in the corner for any more switches.
I finally received the BDR-S06J sleds and extracted the diodes from them. These diodes emit light at a wavelength of 405nm and go up to at least 500mA, where they put out about 600mWatts of power! That’s not bad.