XO 4 CForth Source Code Tour
This is a barely-edited transcript of an IRC discussion about the CForth source code for CL4. Toward the end there is a discussion of a possible bringup scenario.
CForth Source for CL4
CForth is built automatically by the CL4 OFW build, or you can cd to the cforth build directory and run "make".
Build directory is cforth/build/arm-xo-cl4
Source directory is cforth/src/app/arm-xo-cl4
I direct your attention to cforth/build/arm-xo-cl4. That is where builds happen, although if one is building via the OFW tree, that happens automatically.
Q: reviewed. only puzzle i see is that PLATPATH isn't cl4 specific. but APPPATH* has a cl4 specificity.
The platform stuff is low-level, fairly minimal, all in C, and doesn't appear - so far - to differ between boards (so much so that the same platform code even works on Thunderstone), by virtue of the fact that UART3 seems to be the go-to serial port for everybody. Now to src/app/arm-xo-cl4. You will notice that I have done a boatload of factoring at the file level. No more huge app.fth, and significantly fewer ifdefs.
The files in ../arm-mmp2 drive SoC hardware modules largely without knowledge of boards, and mmp3 is so similar to mmp2 in most respects that the mmp2 drivers "just work".
./gpiopins.fth is a key board-specific thing. It is darn near identical to gpiopins.fth in the OFW tree.
Look at ../arm-xo-1.75/boardgpio.fth. It has many ifdefs, but they are conditional not on platforms, but rather on the existence of individual GPIOs.
In general, if one is to have ifdefs, I prefer precision instead of tying to platforms. I do not always follow that rule as well as I should, but eventually I usually get around to doing the right thing.
arm-xo-1.75/memtest.fth has a rudimentary memory test that will no doubt be used heavily for a few days during bringup
The random test that is currently ensconced therein is good for final checkout, but not so good for pinning down specific problems.
There are some primitives that are better for directed testing: arm-xo-1.75/ccalls.fth:{lfill,lcheck,inc-fill,inc-check}. See also XO_CL4_Memory_Test.
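As a rough illustration of what directed testing with an incrementing pattern looks like, here is a minimal C sketch. The function names echo inc-fill and inc-check, but the signatures and return convention are assumptions for illustration, not the actual definitions in ccalls.fth.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical analogue of inc-fill: word i receives start + i. */
static void inc_fill(volatile uint32_t *base, size_t count, uint32_t start)
{
    assert(base != NULL);
    for (size_t i = 0; i < count; i++)
        base[i] = start + (uint32_t)i;
}

/* Hypothetical analogue of inc-check: returns the index of the first
 * word that does not hold start + i, or -1 if every word matches. */
static long inc_check(const volatile uint32_t *base, size_t count, uint32_t start)
{
    assert(base != NULL);
    for (size_t i = 0; i < count; i++)
        if (base[i] != start + (uint32_t)i)
            return (long)i;
    return -1;
}
```

Because every word holds a distinct, predictable value, a mismatch points at a specific address and often at a specific stuck data or address line, which a random test tends to obscure.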
Note address offset for SP access to main memory.
Q: memtest-start in memtest.fth?
Memory address + 0x1000.0000 = SP address.
Q: ah, as in the SP effectively subtracts 1000.0000 prior to accessing memory?
If the TCM (tightly coupled memory, sort of like a poor-man's cache) is off, DRAM address 0 is also mirrored at SP address 0, so you can access it at either 0 or 0x10000000
DRAM address 0x3000.0000, which you might think would appear at SP address 0x4000.0000, is simply inaccessible. If the TCM is on, SP address 0 hits the TCM and is aliased at several boundaries.
Q: before dram init in cforth, where is cforth living? sram?
cforth *always* lives in SRAM at 0xd100.0000
Q: RAMBASE, i see.
Except for the C code that implements the PS/2 spoofer. That uses interrupts, and the ARM vector table must start at 0 (on this ARM variant), so portions of that code must appear in the 0 region. The way we do it is to turn on the TCM, copy that code therein, and thence the spoofer can do much of its work from that TCM, which is internal to the SP core, thus eliminating contention with other activities.
So I have just told you the main ways in which the mental model of the address map differs from obviousness (in SP land).
Q: what is TCM intended for?
One use is to speed up program execution in the same way that cache does. Another is to prevent the SP and the other cores from colliding over the trap vectors, which must be at address 0 in all cases. With TCM on, the SP gets its own private playground near 0. Actually, TCM is ITCM + DTCM - instruction and data - but I don't think we need to go into those details. Suffice it to say that the addressing confusion around 0 exists, and the SP is well-advised to use 0x1000.0000 to access DRAM 0.
wad: Does the MMP3 SP still have the inability to access more than the first 512MB in a bank?
wad: Oh, we don't care at this time... 1GB spread among two banks won't invoke that bug.
In app.fth, board-config is where we do stuff that requires looking at the board ID and fiddling with something, as with the fuse fixing, voltage setting, frequency changing, etc. that we had to tweak on different versions of 1.75.
Q: And because this is for bringup, it does nothing yet, because we don't know we have ec comms yet? compared against xo-1.75/app
Because we don't yet know what we have to fix
Look at arm-xo-cl4/clockset.fth
Q: reviewed. no similar thing in thunderstone?
It's there, but in initdram.fth, and it's dead-nuts identical because we are using the same operating point
That code is translated from a set of register accesses listed in the "NTIM" file that is part of the Thunderstone demo package
As you may recall, the way this puppy normally boots is that the SP runs code from its masked ROM, which searches for an NTIM structure on one of several storage devices.
Q: yes, presumably not much change from Forth_Lesson_20#ARM_Startup
Said NTIM specifies that various modules are to be copied to various places in memory but, since DRAM is generally not on at that time, the NTIM can also sequence a bunch of register writes to fire up memory and whatever
Q: yes, we do that in cforth so we can do it differently if we need to.
Instead of using that feature, we choose to load CForth into SRAM and let it do the magic, but the process is somewhat complicated by the bug that prevents loading more than a certain amount from SPI, so we have to load a small shim and let it bootstrap the full CForth into SRAM. Anyway, I cribbed the recipes for configuring the clocks and for initializing the DRAM from that NTIM.
It appears that our DRAM is 100% compatible with what Tstone is using, timing- and layout-wise, so we may hope that those recipes, which have been tested with CForth on Tstone, will continue to work on CL4.
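A recipe of this kind is essentially an ordered list of register writes. The following C sketch shows one way to represent and replay such a list against a register block; the struct, function name, and offsets are hypothetical and are not taken from the NTIM file, clockset.fth, or initdram.fth.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One step of a hypothetical init recipe: write `value` at byte `offset`. */
struct reg_write {
    uint32_t offset;
    uint32_t value;
};

/* Replay the steps in order. Order matters for clock and DRAM setup,
 * so the list is never sorted or de-duplicated. */
static void apply_reg_writes(volatile uint32_t *regs, size_t nregs,
                             const struct reg_write *seq, size_t nseq)
{
    for (size_t i = 0; i < nseq; i++) {
        size_t index = seq[i].offset / sizeof(uint32_t);
        /* Only word-aligned offsets inside the block are valid. */
        assert(seq[i].offset % sizeof(uint32_t) == 0 && index < nregs);
        regs[index] = seq[i].value;
    }
}
```

Keeping the recipe as data makes it easy to compare against the original register list line by line, which is the main risk when translating it by hand.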
ofw.fth, which is the same across all XO boards, is the recipe for loading OFW into memory and starting the main core running on it
There is a potential issue that I have just now realized. One complication of ofw.fth is the attempt to chart progress via hex numbers on the screen. While we have the same kind of frame buffer in general, it's possible that there are some wrinkles that differ between MMP2 and MMP3. I don't know that there are, but neither do I know that there aren't.
Since Tstone has a wildly-different back end (HDMI, not DCON), I just commented out all that LCD code in the Tstone test.
Q: because we haven't tested it on thunderstone, i presume. therefore i expect you might remove it temporarily. and a manual test of low resolution lcd mode can be done during bringup.
I guess I could put it back in on tstone and just verify that it doesn't cause any hangs (apart from the DCON access, which is sure not to work on tstone, having no DCON to respond to the i2c commands). It's easy enough to dump the frame buffer memory and stare at it, considering the ultra-low res. I shall do that later
The other diff for ofw.fth on tstone is no SPI, so no way to load OFW therefrom.
For testing, I used 2 techniques.
1) load-ofw using serial port to receive binary data of OFW into mem
2) Use JTAG script to push the OFW blob into memory
Both worked, and then I could jump to OFW
release-main-cpu is the thing that wakes up the main CPU after the data is in memory and the early startup vector has been punched in at "0". ofw-go punches in that vector. It differs from the old version in that the active code starts at 0x10, with nops at 0, 4, 8, c. The old version has the active code at 0.
The reason is that, empirically, when using the JTAG setup, the main cores begin running immediately, access unconfigured memory, and trap out to the prefetch-abort vector at 0xc.
I was unable to develop a recipe for forcing them to reset the PC to 0, without crashing the JTAG system. It's possible that's a JTAG artifact resulting from the need for the JTAG software to separately try and connect with each core individually. Maybe it won't happen when running native from NTIM. I don't know. Anyway, I just moved stuff up to 0x10 with nops before, and all was well.
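The layout described here, four no-ops followed by the real startup code at 0x10, can be sketched in C as below. The function name is hypothetical, and the encoding chosen for the no-op ("mov r0, r0", 0xE1A00000) is an assumption; the transcript does not say which encoding ofw-go actually uses.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ARM_NOP        0xE1A00000u /* "mov r0, r0", a traditional ARM-mode no-op (assumed) */
#define STUB_PAD_WORDS 4           /* the words at 0x0, 0x4, 0x8 and 0xc */

/* Build the image to be placed at address 0: padding no-ops first,
 * then the startup code beginning at byte offset 0x10.
 * Returns the number of words written. */
static size_t build_startup_stub(uint32_t *dst, size_t dst_words,
                                 const uint32_t *code, size_t code_words)
{
    assert(dst != NULL && code != NULL);
    assert(dst_words >= STUB_PAD_WORDS + code_words);
    for (size_t i = 0; i < STUB_PAD_WORDS; i++)
        dst[i] = ARM_NOP;
    for (size_t i = 0; i < code_words; i++)
        dst[STUB_PAD_WORDS + i] = code[i];
    return STUB_PAD_WORDS + code_words;
}
```

With this layout a core that enters at the prefetch-abort vector (0xc) rather than at reset (0) executes one no-op and falls through into the same startup code, so both entry points behave identically.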
In release-main-cpu there is some code to release the additional 2 cores, commented-out. I got that code from a JTAG script. Empirically, it's okay to release just the one MP core. When all was said and done, release-main-cpu became effectively identical to what we used to have at that point, although I used a bitclr instead of slamming in a 0, but same effect.
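For readers unfamiliar with the distinction, a bit-clear is a read-modify-write that touches one bit, whereas writing 0 replaces the whole register. A minimal C sketch, with a hypothetical helper name and no claim about which register or bit release-main-cpu really uses:

```c
#include <assert.h>
#include <stdint.h>

/* Clear a single bit and leave the other bits of the register alone. */
static void bit_clear(volatile uint32_t *reg, unsigned bit)
{
    assert(bit < 32);
    *reg &= ~(UINT32_C(1) << bit);
}
```

The two approaches give the same result only when the bit being cleared is the only one set; the bit-clear form stays correct if other bits in the register later acquire meaning.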
So that is cforth.
Q: And the uarts are no different, so we should at least see something.
Note that I have not tested the NTIM action - everything has been done through jtag - because I didn't want to fool with emmc (no SPI on tstone).
Q: And if we don't see anything, it implies CForth hasn't got far. Provided they didn't make major changes to the SSP we use to load code
The nice thing is, we will have JTAG initially, and with JTAG you can step through CForth, although I doubt that will be necessary. I think it will "just work", at least far enough to get a prompt.
We can put a "quit" anywhere in "app" and likely get a prompt, because the C code in "platform" inits enough to make the UART work.
Bringup
At bringup they will have a board hooked up to jtag. First smithbone will attack it to get the EC running and sequencing the power rails. Once they get that working, gary will probably probe the SPI to see if it tries to find an NTIM there. It won't find it, but we can watch it try, and that will tell us that the SoC is awake and trying. At that point the JTAG debugger can grab control and we can run a simple "cforth.xdb" script that punches the cforth image into sram
Gary: sure , but what is NTIM?
NTIM is the format for the blob in SPI FLASH - Non-Trusted Image Module
That step could be skipped, but it's a quick way to see if the SoC is trying to do something - an easy way to see if the power came up right
The cforth.xdb script is identical to the one we used for MMP2. There is also an ofw.xdb script that punches OFW into memory.
The sequence is:
a) Start XDB on host - it grabs the SoC via JTAG, resets the SoC, and stops the 4 ARM cores
b) Run the batch script "cforth.xdb", which loads cforth.img (the verbatim result of compiling CForth in build/arm-xo-cl4) into SRAM and starts the SP executing it
c) Now you should get an ok prompt on the serial port
d) Test memory and otherwise poke around until memory is working
e) At this point you can go down two paths - either try to go directly to ofw (using JTAG or serial to load it), or work on the SPI
f-ofw) Run the "dlofw.xdb" batch file in XDB, thus loading OFW into DRAM
g-ofw) ok ofw-go \ Prep for running OFW
h-ofw) In XDB, type "stop; set core 2; run" to release the main CPU from JTAG-controlled state, and OFW should start talking on serial
f-spi) Use existing procedures already documented on the wiki to have CForth punch the image into SPI flash. See SPI_FLASH_Recovery_for_XO-CL4_Using_CForth and SPI_FLASH_Recovery_for_XO-CL4_Using_JTAG