The Web Site to Remember National Semiconductor's Series 32000 Family

Jon

In spring 2017 I found the website of Jon Elson. He has done a number of interesting projects, and for me his Series 32000 projects are of course the most interesting. For his other projects, please visit his website, pico-systems.com.

The Nat Semi 16032 Project

I asked Jon whether I could present some of the photos and text from his website on my Series 32000 website. He said "yes", and here is his description of his self-built system:

"Then, I did a little side consulting for a group at the Med School, and advised them to buy a system based on the NS 16032 that ran Genix. I didn't have the time to do a big project for them, but suggested this would be capable of doing what they wanted in a data acquisition/ data analysis project. The machine was made by Logical Machine Co. of Chicago, which soon folded. But, the machine worked, and came with schematics, so I was able to clone it! So, after some effort, I had a 32-bit Genix system running. But, it was SLOW!!! In fact, it was so slow that even editing a file was a maddening effort, but I did learn some stuff about Unix-derived systems. Oh, I had a cast-off Versatec 1200 electrostatic printer from work, which could print text at 1200 LPM, but when printing in bit-map form it was achingly slow, about 10 minutes per page. I had made some mistakes in the driver like allocating and freeing the data buffer for every data block. But, I was really working in the dark, I knew NOTHING of Unix device driver writing."

I believe that poor software performance was the reason why the system was so slow...

Fig. 1. Top side of the CPU board with all chips of the first Series 32000 generation running at 10 MHz. The NS32202 ICU is at the right edge.

Fig. 2. The bottom side of the CPU board.

I asked Jon about the color code of the wires and he gave an impressive answer:

"The color code is by wire length. I wrote a program to organize the wire-wrap process. You enter the X-Y coordinate and number of pins of each chip, and a wire list. It outputs a list of wires, partially optimized for shortest length (but if a lot of pins are on the net, it runs into combinatorial trouble and has to fudge it.) It organizes the wiring on two levels, first level on the left column, second level on the right column. And, it gives the length wire to use. I got a big kit of color-coded wire wrap wire, kind of in resistor color code by length."

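Jon's program itself is not shown on his site, but the idea he describes can be sketched roughly as follows. This is only an illustration under assumptions: it takes the pin coordinates of one net directly (instead of chip positions and pin counts), orders them with a simple nearest-neighbour heuristic, alternates the two wrap levels along the chain, and rounds each wire up to a stock length. The `STOCK_LENGTHS` table and the function names are hypothetical.

```python
from math import hypot

# Hypothetical stock lengths (inches) of a pre-cut wire-wrap kit.
STOCK_LENGTHS = [1.0, 1.5, 2.0, 2.5, 3.0, 4.0, 5.0, 6.0]


def chain_net(pins):
    """Order the pins of one net greedily: always hop to the nearest unvisited pin.

    This is a heuristic, not an optimal route; finding the true shortest
    chain becomes combinatorially expensive for nets with many pins.
    """
    remaining = list(pins)
    path = [remaining.pop(0)]
    while remaining:
        last = path[-1]
        nearest = min(remaining, key=lambda p: hypot(p[0] - last[0], p[1] - last[1]))
        remaining.remove(nearest)
        path.append(nearest)
    return path


def wire_list(pins):
    """Return one (from_pin, to_pin, level, stock_length) tuple per wire."""
    path = chain_net(pins)
    wires = []
    for i, (a, b) in enumerate(zip(path, path[1:])):
        dist = hypot(b[0] - a[0], b[1] - a[1])
        # Smallest stock wire that is long enough, or None if the kit has none.
        stock = next((s for s in STOCK_LENGTHS if s >= dist), None)
        # Alternate levels 1 and 2 so no post carries two wraps on the same level.
        level = 1 + i % 2
        wires.append((a, b, level, stock))
    return wires


if __name__ == "__main__":
    for wire in wire_list([(0, 0), (5, 0), (1, 0), (2, 0)]):
        print(wire)
```

The greedy ordering is where such a tool "has to fudge it": it is fast, but it does not guarantee the shortest total length for a net with many pins.
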
The Multiprocessor System

For the first time I can present a Series 32000 multiprocessor system! Many of them existed, but in 2017 I was still looking for one. Jon built the system around 1985 and it still exists today at his home. I would really like to play with this machine and run Mandelbrot on it...

This multiprocessor system was not a general purpose computer. It was built to speed up certain calculations. Jon wrote about the system:

"One of the first projects I worked on was adding a multiprocessor to the VAX 11/780. National Semiconductor came out with the 16032 (later renamed the 32016) which was fairly close to a VAX on a chip. 32-bit architecture, but a 16-bit external bus. Separate clock generator, separate FPU, separate MMU. They made it available on a Multibus-II card pretty cheap to universities, and had cross-development software that would run on VMS. So, I got some Multibus-II memory, a backplane and a bunch of the 16032 boards, and a DR-11 DMA interface. I wire-wrapped up a DR-11 to Multibus adaptor that plugged into the Multibus backplane. I found a bug in the bus synchronizer of the 16032 board that prevented multiple masters from working correctly, it was made on a 40-pin DIP hybrid module. NS supplied me the schematic, the problem was immediately obvious, no synchronizers on the strobe inputs from the bus, so it would occasionally lock up the state machine. I added an external FF in the patch area of the board to fix that problem, and then had up to 7 boards that would run simultaneously. 7 was the limit as the motherboard didn't have round-robin priority scheduling, and 7 masters would completely tie up the bus, no more boards could get access to the memory. But, that was enough for development. We were using this to speed up "tape scanning", which meant reading nuclear events from tape, and doing calibration of the raw data, gating on particular combinations of channels coming in, and other first-step processing, and then either writing the result out to tape or disk, depending on size. The scheme was, first you converted the user's FORTRAN program from VAX form to NS's form and merged it into the multiprocessor framework, and then ran it through the NS cross-compiler. Then, you ran a wrapper program that dealt with the tapes and disk files. 
It would load the cross-compiled program into the global memory of the Multibus system, put a code word into a specific location, and then trip a master reset line. All the 16032's would see the code word and copy the cross-compiled program to local memory on each board and begin executing it. The wrapper program on the VAX would read blocks of events from tape and put them in the global memory, and set flags to indicate which ones were new. The 16032's would use an atomic read-modify-write to allocate these flags indicating they were going to handle that block, and then process the data. When they needed to increment a histogram count, they would use an atomic RMW cycle to update that word (the 2D histograms took up most of the Multibus memory). This thing worked quite well, but the 16032 was no speed demon, even without memory management. 7 of them was equal to about 2 to 2.5 VAX 780's. I could have made the program that copied the cross-compiled program to local memory take a break every 128 words and probably that would have made the system handle more CPUs, but wasn't sure how many more it would handle before bus contention became an issue again."
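
The claim-and-update scheme Jon describes can be illustrated with a small simulation. This is a sketch, not his code: Python threads stand in for the 16032 boards, a `threading.Lock` stands in for the bus-locked read-modify-write cycle, and the flag values `NEW` and `CLAIMED` as well as all class and function names are hypothetical.

```python
import threading

NEW, CLAIMED = 1, 2  # hypothetical flag values for a block of events


class SharedMemory:
    """Stand-in for the Multibus global memory.

    The lock models an indivisible (bus-locked) read-modify-write cycle.
    """

    def __init__(self, n_blocks, n_bins):
        self._lock = threading.Lock()
        self.flags = [NEW] * n_blocks
        self.hist = [0] * n_bins

    def test_and_claim(self, block):
        """Atomically claim a block; return True only for the first caller."""
        with self._lock:
            if self.flags[block] == NEW:
                self.flags[block] = CLAIMED
                return True
            return False

    def increment(self, bin_index):
        """Atomically add one count to a histogram bin."""
        with self._lock:
            self.hist[bin_index] += 1


def worker(mem, blocks):
    """One 'processor': scan all blocks and process those it manages to claim."""
    for i, events in enumerate(blocks):
        if mem.test_and_claim(i):
            for channel in events:
                mem.increment(channel)


def run(blocks, n_bins, n_workers=7):
    """Process all blocks with n_workers concurrent workers and return the memory."""
    mem = SharedMemory(len(blocks), n_bins)
    threads = [threading.Thread(target=worker, args=(mem, blocks)) for _ in range(n_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return mem


if __name__ == "__main__":
    result = run([[0, 1, 1], [2, 2, 2], [0]], n_bins=3)
    print(result.hist)
```

Because both the claim and the increment are indivisible, every block is processed exactly once and no histogram count is lost, regardless of how the workers interleave.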

Fig. 3. Jon's multiprocessor machine uses 11 boards: "The wire-wrapped DMA interface is just to the right of the power supply, then there are 7 DB32016 boards, 2 1 MB memory boards and a bus indicator board."

The machine runs at a clock frequency of 8 MHz. If 10 MHz could be used, the performance would rise from 2 to 2.5 times to about 2.5 to 3 times that of a VAX 780, which runs at 5 MHz. The power supply can deliver 100 A at 5 V. Each processor board consumes around 6 A.
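
The clock-scaling estimate can be checked with a few lines of arithmetic. The sketch assumes that throughput scales linearly with the CPU clock, which ignores memory and bus limits; the function name is made up for illustration.

```python
def scale_range(low, high, f_old_mhz, f_new_mhz):
    """Scale a performance range linearly with clock frequency (an assumption)."""
    factor = f_new_mhz / f_old_mhz
    return low * factor, high * factor


# Jon's figure: 7 boards at 8 MHz equal about 2 to 2.5 VAX 780s.
low, high = scale_range(2.0, 2.5, f_old_mhz=8.0, f_new_mhz=10.0)

# Supply current of the 7 processor boards at about 6 A each.
cpu_current_amps = 7 * 6

if __name__ == "__main__":
    print(low, high)
    print(cpu_current_amps)
```

The processor boards alone therefore draw well under half of what the 100 A supply can deliver; the memory boards and the DMA interface are not included in this figure.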

Fig. 4. Jon's CPU board looks very similar to the DB16000 board, which can be seen at Systems/National Semiconductor.

"There are a couple of 14-pin DIPS added to the DB32016 to allow the hybrid bus controller to be synchronized to an external clock. Nat Semi was REALLY nice about sending me the schematics of this part they bought, and it was immediately obvious why it was not working reliably. They assumed the DB32016 would provide the clock for the Multibus."

The added devices are in the upper left corner of Figure 4. The hybrid synchronizer, named DH8218, is in the lower left corner. In the photo of the DB16000 at Systems/National Semiconductor, an Intel D8218 device is found in the same socket. Obviously National had problems getting final parts of this device.

Figure 4 also shows, for the first time, an NS32016D-8 CPU specified for an 8 MHz clock frequency. The NS32081 FPU and the NS32201 TCU are able to run at 10 MHz, but the TCU crystal is only a 12 MHz part.

For an overview of the Multibus it is useful to read the DB32016 User Manual.

Fig. 5. Jon wrote about the backside of the CPU board: "The rat's nest of wires on the back hook the patch chips to the bus controller."

Fig. 6. The DMA interface board. No programmable logic is used.

About the interface between the Multibus and the DEC DR11 DMA interface in the VAX system Jon wrote:

"There was an address counter on the board, so you could set the address on the Multibus side, and then do DMA transfers either way between the VAX and the Multibus memory. I patched the monitor ROMs on the DB32016 so if location zero of Multibus memory contained a code word (AAC55F2D) it would copy a block from the Multibus to the local memory and start executing, whenever a bus reset happened."

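The reset behaviour of the patched monitor can be sketched as follows. Only the code word and its position at location zero come from Jon's description. The block length, the assumption that the program follows directly after the code word, the load address in local memory, and the function name are all hypothetical. The code word is read as a little-endian 32-bit value, matching the byte order of the Series 32000.

```python
import struct

CODE_WORD = 0xAAC55F2D  # from Jon's description
HEADER_BYTES = 4        # assumption: the program follows right after the code word


def on_bus_reset(global_mem, local_mem, block_len):
    """Copy the program to local memory if the code word is present.

    Returns True if the copy was made (the real board would then start
    executing the program), False if the code word was not found.
    """
    if len(global_mem) < HEADER_BYTES + block_len or len(local_mem) < block_len:
        raise ValueError("memory image too small for the requested block")
    (word,) = struct.unpack_from("<I", global_mem, 0)
    if word != CODE_WORD:
        return False
    local_mem[:block_len] = global_mem[HEADER_BYTES:HEADER_BYTES + block_len]
    return True


if __name__ == "__main__":
    shared = bytearray(struct.pack("<I", CODE_WORD) + b"\x01\x02\x03\x04")
    local = bytearray(8)
    print(on_bus_reset(shared, local, 4), local.hex())
```
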
Fig. 7. The backside of the DMA interface board.

A description of Jon's machine is available on the internet: http://www.chemistry.wustl.edu/.../1987_NIM_262.pdf

A similar machine using four DB32016 boards is described at http://www.iaea.org/.../19019376.pdf . The predecessor system used the Zilog Z-8001 processor. In the document I found a nice statement about the software "development" for its successor CPU, the NS32016:

"The histograming software used with the data aquisition system is essentially a translation of the Z-8001 assembly language software to equivalent 32016 assembly language instructions, ..."

Those were the good old days of computing :-)
