
Showing 20 of 26 results by totalslacker
Post
Topic
Board Group buys
Re: [CLOSED] Bitmine CoinCraft A1 28nm chip distribution / DIY support
by
totalslacker
on 26/03/2014, 15:18:10 UTC

Quote
But when we tried to push it past 1050MHz clock (to all the way to 1200MHz) it seems that cgminer is showing us wrong results. Cgminer showed a bit smaller hashing speed than expected (Sys_clk * 32), but it kept on going all the way to 38GH/s per chip. HW errors were very small, smaller than 32GH/s settings. Did not have any rejections or stales.


Hello,

a diverging hashrate at pool and cgminer simply means you are losing shares through HW errors.

What you need to consider is:

a) a detected HW error also implies that there were errors on true results; the related probability needs to be derived correctly, but I would assume that when you have a HW error rate of 5% it also means you are missing 5% of real results


I have noticed the same thing when pushing the hardware beyond 25GH/s. In my case I'm looping the test vector zefir had posted a while back. Since this has known nonces I can verify that the hardware is returning the correct nonce sequence. Irrespective of errors the time taken for each chip to finish a job always seems to correlate very closely to the configured hash rate.

I notice that the hardware tends to drop nonces before it starts to produce bad ones. As I push the chip harder and harder the "good" nonce rate drops to zero and bad nonces become frequent.

However, this is ultimately all a symptom of a core voltage that is too low. 35GH/s gets pretty stable at around 1.050V. I had previously thought it stable at 0.975V, but longer tests started producing more errors…

I've modified my supply to get higher output voltages but haven't gotten back to testing it yet.

You do need aggressive cooling at these voltages, so be careful! You can get away with short runs on minimal cooling, but even sitting idle at these voltages it's easy to generate enough heat to pop a chip (as I learned the other day when my code crashed in the debugger and I got distracted trying to figure out a bug I had been seeing from time to time).
Post
Topic
Board Group buys
Re: [CLOSED] Bitmine CoinCraft A1 28nm chip distribution / DIY support
by
totalslacker
on 19/03/2014, 23:05:37 UTC
I failed to measure current at 40GH - I can check that next time I try it.

35GH/s was 39A@0.975V = 38W.

I did try a longer run under bfgminer and was seeing some hardware errors (about 0.5%). Not sure why my test wasn't catching them - I was just running zefir's test vector over and over (and validating that the correct nonces were returned). Guess I need more test vectors :)

The board did start getting pretty hot. I have a water block attached to the bottom of the board but just a heatsink on top of the chips. The heatsink was sitting at around 35C but the top of the board itself was getting to over 60C. I suspect I need more vias to better transfer heat to the bottom of the board, or a heatsink that can cover the board top itself.

Clearly immersion is the way to go :)
Post
Topic
Board Group buys
Re: [CLOSED] Bitmine CoinCraft A1 28nm chip distribution / DIY support
by
totalslacker
on 19/03/2014, 21:18:45 UTC
I added code to trim my supply and the good news is that it looks like the board is stable at 35GH/s at 0.975V. To get any faster than that I need to get over the max 1.050V my supply is able to put out (to do that I need to disassemble the cooling and change a sense resistor - not terrible but it is a hassle).

A single chip (the board has four) almost works at 1.050V/40GH/s. If I run all four then the supply under load drops to about 1.030V which doesn't work too well.

Need to validate this on more than one board of course :)
Post
Topic
Board Group buys
Re: [CLOSED] Bitmine CoinCraft A1 28nm chip distribution / DIY support
by
totalslacker
on 18/03/2014, 18:30:25 UTC
Did Bitmine change chips specs?

Wasn't turbo 40GH? Now it is 33GH... But still same price... But I do agree I don't see way to get them to 40GH with normal cooling...

Well that's sad to see… I've still been hopeful that I could get 40 to work someday… It actually does kinda work, I just notice that when I overclock the chip I start dropping nonces (when running a known nonce test case)…

EDIT2: And from what I see you can't get 33GH out at 0.85V more like 0.985 to 1 V. Also 1W in Turbo mode is fantasy... More like 1,3 to 1,5W... Hard to say how much is the loss on chip power supply...

I haven't played with increasing the core voltage beyond 0.85V - still on the list. If 1V really is necessary, that's going to be a decent chunk of power indeed!

Has anyone been able to get 33GH/s or higher to run?

The current pricing really does need to be adjusted. Several competitors are all coming online very soon at well under $2/GH. I'd really prefer to stick with CoinCraft as I have a working design and am very happy with the support I've seen (zefir plus free samples). But the current pricing level is going to make it very hard to be profitable for long...
Post
Topic
Board Group buys
Re: [OPEN] Bitmine CoinCraft A1 28nm chip distribution / DIY support
by
totalslacker
on 21/02/2014, 00:11:27 UTC

Have you been able to measure the current or power consumed by one chip at 800Mhz?


Not a super accurate measurement, but it looks to be about 20A. This is with the chip at around 40C; as it gets hotter the current goes up (I was seeing around 22-24A at 60C). I do have the board instrumented so that I should be able to get more accurate results, I just haven't written code to look at that yet :)

Oh, this was with a core voltage of 0.84V. So looks to be pretty spot-on at 0.67W/GH at 25GH/s (assuming you have decent cooling).
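For anyone checking the arithmetic, here is a minimal C sketch of the numbers above: core power is voltage times current, and efficiency is that divided by hash rate. The function names are made up for illustration, and the inputs are just the rough readings quoted in this post, not calibrated measurements.

```c
#include <assert.h>

/* Core power in watts from core voltage (V) and core current (A). */
double core_power_w(double volts, double amps)
{
    return volts * amps;
}

/* Efficiency in watts per GH/s. `ghs` is the hash rate in GH/s. */
double watts_per_ghs(double volts, double amps, double ghs)
{
    assert(ghs > 0.0);
    return core_power_w(volts, amps) / ghs;
}
```

With 0.84V, 20A and 25GH/s this works out to 16.8W and about 0.67W/GH. At the 22-24A seen at 60C it rises to roughly 0.74 to 0.81W/GH.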
Post
Topic
Board Group buys
Re: [OPEN] Bitmine CoinCraft A1 28nm chip distribution / DIY support
by
totalslacker
on 20/02/2014, 01:30:32 UTC

One other question here: are there known PLL settings for faster speeds?

I tried manually running through your set_pll_config code for this but the result didn't come out so well (I might very well have done this incorrectly). I wasn't quite sure what the extents of the various fields are as the data sheet looks to be different to your code (and to the various PLL settings posted).

Assuming you have a 12MHz ref clock, try this:
  • set pll_postdiv and pll_prediv to 2
  • set fbdiv to (target_sys_freq / 3), i.e. you can set your sys_clock in increments of 3MHz, with fbdiv being 9 bit you can go to 1.5+GHz

Code:
reg[0] = 0x84 | (fbdiv >> 8)
reg[1] = fbdiv & 0xff
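Put together, the recipe above might look like the following C sketch. It only fills the two bytes shown in the snippet; the function name, the range check and the return convention are assumptions for illustration, and the rest of the PLL register is not covered here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define A1_PLL_STEP_MHZ 3u      /* 3MHz steps, per the recipe (12MHz ref, prediv = postdiv = 2) */
#define A1_FBDIV_MAX    0x1ffu  /* fbdiv is a 9-bit field */

/* Fill the first two PLL register bytes for a target system clock in MHz.
 * Returns the clock actually configured (target rounded down to a 3MHz
 * step), or 0 if the target does not fit in the 9-bit fbdiv field. */
unsigned a1_pll_bytes(unsigned target_mhz, uint8_t reg[2])
{
    unsigned fbdiv = target_mhz / A1_PLL_STEP_MHZ;

    assert(reg != NULL);
    if (fbdiv == 0 || fbdiv > A1_FBDIV_MAX)
        return 0;

    reg[0] = (uint8_t)(0x84u | (fbdiv >> 8)); /* fixed bits plus fbdiv bit 8 */
    reg[1] = (uint8_t)(fbdiv & 0xffu);        /* fbdiv bits 7..0 */
    return fbdiv * A1_PLL_STEP_MHZ;
}
```

For a 1250MHz target this gives fbdiv = 416 (0x1a0), so reg[0] = 0x85 and reg[1] = 0xa0, and the configured clock is 1248MHz.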

Thanks zefir!

That looks to work very well. Unfortunately my hashing test is hitting some errors at anything much over 800MHz.

I don't see chips dropping out or bad results being produced, just that nonces are missed. My code checks for all six nonces in sequence for each job issued (all four chips are run simultaneously and the job queue is kept as full as possible).

I don't think the nonce queue is overflowing as I'm issuing the read result command very frequently (and then checking chip status for finished jobs). I will get a bunch of no results and then a chip reports a nonce that has skipped a previous result.
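The check described here can be sketched roughly as follows (C, not the actual test code). It assumes the expected nonces for the test vector are known in the order the chip finds them; the struct and function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct nonce_stats {
    size_t good;    /* expected nonces that arrived, in order */
    size_t dropped; /* expected nonces that were skipped */
    size_t bad;     /* returned values not in the remaining expected list */
};

/* Walk the nonces a chip returned for one job against the known, ordered
 * list for the test vector. A returned nonce that matches a later entry
 * means the entries in between were dropped. */
struct nonce_stats check_job(const uint32_t *expected, size_t n_expected,
                             const uint32_t *got, size_t n_got)
{
    struct nonce_stats s = {0, 0, 0};
    size_t next = 0; /* index of the next nonce we expect to see */

    assert(expected != NULL || n_expected == 0);
    assert(got != NULL || n_got == 0);

    for (size_t i = 0; i < n_got; i++) {
        size_t j = next;
        while (j < n_expected && expected[j] != got[i])
            j++;
        if (j == n_expected) {
            s.bad++;            /* not a nonce we were still waiting for */
            continue;
        }
        s.dropped += j - next;  /* everything skipped to get here */
        s.good++;
        next = j + 1;
    }
    s.dropped += n_expected - next; /* never seen by the end of the job */
    return s;
}
```

Counting dropped and bad nonces separately makes it easy to see the pattern where results go missing well before wrong ones start to appear.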

I haven't looked at power at the board level at all yet - hopefully that's the issue. I think this board is running a little low on the voltage front so hopefully I can clean things up with that.

The good news is that I did get it to run at up to 1.25GHz (40GH/s). About two-thirds of the nonces were dropped but it successfully ran the test (101 jobs) to completion.

My SPI frequency is a little low (think it's around 250kHz) but I wouldn't expect that to cause issues unless it were so low that the nonce queues couldn't be serviced frequently enough?
Post
Topic
Board Group buys
Re: [OPEN] Bitmine CoinCraft A1 28nm chip distribution / DIY support
by
totalslacker
on 19/02/2014, 02:15:16 UTC
zefir,

One other question here: are there known PLL settings for faster speeds?

I ran a quick test (submitting 101 jobs to the four chips as fast as they could take them and then checking that I got all six of my nonces out of each one) and things looked good (no errors and came out right at 25GH/s).

I need to run this for longer of course to be sure, but I need to get the cooling going for that.

I'd also like to try pushing the chip clock rate for a short run, perhaps seeing if I can get to the 40GH/s turbo mode. I assume that's computed based on a 1.25GHz system clock, so for my 12MHz reference clock I would need an effective multiplier of 104.
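That estimate can be written out as a small C sketch. It relies on the 32 hashes per system clock relation used elsewhere in this thread; the function names are invented here, and rounding the multiplier to the nearest integer is an assumption.

```c
#include <assert.h>

#define A1_HASHES_PER_CLOCK 32.0 /* hash rate = 32 * sys_clk */

/* System clock in MHz needed for a target hash rate in GH/s. */
double sys_clk_mhz_for(double target_ghs)
{
    assert(target_ghs > 0.0);
    return target_ghs * 1000.0 / A1_HASHES_PER_CLOCK;
}

/* Effective PLL multiplier for that clock, rounded to the nearest integer. */
unsigned pll_multiplier(double sys_clk_mhz, double ref_mhz)
{
    assert(sys_clk_mhz >= 0.0 && ref_mhz > 0.0);
    return (unsigned)(sys_clk_mhz / ref_mhz + 0.5);
}
```

For 40GH/s this gives 1250MHz, and 1250 / 12 rounds to 104. A multiplier of exactly 104 lands at 1248MHz, so slightly under 40GH/s.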

I tried manually running through your set_pll_config code for this but the result didn't come out so well (I might very well have done this incorrectly). I wasn't quite sure what the extents of the various fields are as the data sheet looks to be different to your code (and to the various PLL settings posted).

Thanks for your help!
Post
Topic
Board Group buys
Re: [OPEN] Bitmine CoinCraft A1 28nm chip distribution / DIY support
by
totalslacker
on 18/02/2014, 23:15:30 UTC

Thanks for feedback and good luck (you're almost there ;) )

Yeah, now I just gotta get it to go fast :) Well, and get it working with cgminer. Since I'm driving the parts with a microcontroller I need to write a new cgminer driver to drive that…

Before going faster I need to get some cooling in there though. I noticed on your bring up page you mention keeping the temps below 50C. Is this a hard limit?

I was just doing a longer test where I kept all four chips completely busy with your test vector and verified that the correct nonces came out. Running at 240MHz a couple of the chips did get a little hot. The two running at 0.8V stayed well below 50C but the two at 0.84V got up to about 61C.

No errors during the 12 minutes I ran the test. Theoretical hash rate is 32 * 240MHz = 7.68GH/s and actual turned out to be 7.62GH/s.
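A rough sketch of that comparison in C. It assumes each job scans the full 32-bit nonce range (2^32 hashes), which is an assumption rather than something stated in this post; the function names are invented for illustration.

```c
#include <assert.h>

#define HASHES_PER_JOB 4294967296.0 /* 2^32, ASSUMED full nonce range per job */

/* Theoretical rate in GH/s: 32 hashes per system clock cycle. */
double theoretical_ghs(double sys_clk_mhz)
{
    return 32.0 * sys_clk_mhz / 1000.0;
}

/* Measured rate in GH/s from jobs completed over a wall-clock interval. */
double measured_ghs(unsigned long jobs_done, double seconds)
{
    assert(seconds > 0.0);
    return (double)jobs_done * HASHES_PER_JOB / seconds / 1e9;
}

/* Fraction of the theoretical rate actually achieved. */
double hash_yield(double measured, double sys_clk_mhz)
{
    assert(sys_clk_mhz > 0.0);
    return measured / theoretical_ghs(sys_clk_mhz);
}
```

For the run above, theoretical_ghs(240) is 7.68 and hash_yield(7.62, 240) is about 0.992, i.e. within 1% of theoretical.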

But, how bad is it to run the chips that hot? Is that a "might get bad results now and then" or a "might damage chips" situation?

Thanks!
Post
Topic
Board Group buys
Re: [OPEN] Bitmine CoinCraft A1 28nm chip distribution / DIY support
by
totalslacker
on 18/02/2014, 22:59:58 UTC
Having some trouble with getting 0x04 response on the 0x04 reset command.
I'm doing the HW reset as described. The signals from raspi passed through level shifters.
Chip select DI CLK are ok. I see 0x04 passing into the chip but there is nothing at the output.
The VDDcore is 0.7 volts. Maybe it is too low?

That's pretty likely too low. I had similar issues when I was running the chip that low. 0.8V seems to be pretty reliable for at least basic comms for most chips. A couple wouldn't hash at that voltage but once I brought it up to about 0.84V they were fine.

Of course, now those chips run hotter than the others :)
Post
Topic
Board Group buys
Re: [OPEN] Bitmine CoinCraft A1 28nm chip distribution / DIY support
by
totalslacker
on 18/02/2014, 16:35:40 UTC

You will notice that you get back 4 results, although the job has 5 winning nonces. That is because the A1 has an output queue of 4 elements, so one of the results is overwritten. To get them all, you need to pull the results early enough while chip is still hashing. Have a look at the reference driver to check how to do this continuously.


Thank you! I had indeed switched over to cgminer but getting the test to run correctly makes me feel a lot better about the status of my hardware. :)

Interestingly, I get six nonces. The four you posted, the one I assume was overwritten, and an extra. Same results for all four chips (haven't tried the second board).

Code:
************** Got nonce! 18 8d b1 99
************** Got nonce! 3a a6 b2 0c
************** Got nonce! 3f 8f 64 de
************** Got nonce! b9 9c c7 09
************** Got nonce! be 7b 58 b3
************** Got nonce! ec c1 4e 74


What you get back from the chip is a valid Diff1 share, while your pool is obviously asking for higher difficulty shares. That's absolutely normal, i.e. you will see this trace log with every HW that produces Diff1 shares. cgminer then drops all those below pool's difficulty.


OK, perfect. I figured the target difficulty wasn't right given it's fixed in the job creation function. I will integrate your new code at some point. Until then, you're correct: getting confirmation of things working is very helpful!

First, however, it's time to do some reliability testing with the test vector :)
Post
Topic
Board Group buys
Re: [OPEN] Bitmine CoinCraft A1 28nm chip distribution / DIY support
by
totalslacker
on 18/02/2014, 04:00:56 UTC

So perhaps I have a nasty bug somewhere…


I brought up the second board we had built and fortunately everything pretty much just worked - getting spoiled with that happening twice in a row :)

It still produces the same nonce on your test hash, zefir (and it's consistent at a range of clock rates, from clock_mux set (12MHz) to the default PLL at 800MHz), so I'm pretty sure whatever I'm doing to get this result is outside of the chip.

I have a raspberry pi that should come in tomorrow. When that's here I can hook it up to the spi test headers and validate it against your driver.

In the meantime I cobbled together a really quick and dirty driver for cgminer (based off your bitmine driver) that talks to the microcontroller on my board via a UART. It sends down one item of work and waits for a response (not very speedy!). I get this sort of result in the cgminer logs:

[2014-02-17 18:30:33] BMH_scanwork: nonce=0xbe72021b
[2014-02-17 18:30:33]  Proof: 00000000a670e9d47f911dbe284736a57bb18608989637e1d99b81c8eb3fc1b0
Target: 0000000010000000000000000000000000000000000000000000000000000000
TrgVal? no (false positive; hash > target)
[2014-02-17 18:30:33] Share above target
[2014-02-17 18:30:33] YEAH: chip 0: nonce 0xbe72021b

Is this correct? Should the final hash be below the target? (The "false positive" comment is unclear to me.) Guess I need to learn more about cgminer :)
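For reference, the comparison the log is reporting can be sketched like this in C. This is not cgminer's code (cgminer works on the binary hash); it just compares the two hex strings as printed, assuming both are 64 lowercase hex digits with the most significant digit first.

```c
#include <assert.h>
#include <string.h>

/* Returns 1 if the hash is at or below the target, 0 otherwise.
 * Both arguments are 256-bit values written as 64 lowercase hex digits,
 * most significant digit first. For equal-length, same-case hex strings
 * byte-wise ordering matches numeric ordering ('0'-'9' sort below 'a'-'f'). */
int hash_meets_target(const char *hash_hex, const char *target_hex)
{
    assert(strlen(hash_hex) == 64);
    assert(strlen(target_hex) == 64);
    return memcmp(hash_hex, target_hex, 64) <= 0;
}
```

With the Proof and Target values from the log the function returns 0: the first digit that differs is a in the proof versus 1 in the target, which matches the "hash > target" note.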

Thanks for any suggestions!
Post
Topic
Board Group buys
Re: [OPEN] Bitmine CoinCraft A1 28nm chip distribution / DIY support
by
totalslacker
on 16/02/2014, 00:23:00 UTC

I will work on getting the voltage up to 0.85V.


I used the supply margining feature and got the core voltage up to 0.85V. I'm still getting exactly the same results.

I poked around online and found some other test vectors. I don't get the same results as they do either, but I consistently get the same (incorrect) nonces, so I guess that's good :) I'm running multiple jobs through each chip and looping through the whole test five times (just running one chip at a time currently).

I'm seeing peak to peak ripple of about 50mV on the scope so power looks to be good. I would also expect to not get such consistent behavior if I had a power issue?

So perhaps I have a nasty bug somewhere…

Thanks!
Post
Topic
Board Group buys
Re: [OPEN] Bitmine CoinCraft A1 28nm chip distribution / DIY support
by
totalslacker
on 15/02/2014, 22:31:26 UTC

Hm, this to me looks like a power issue - at least the effects you observe are similar to what we had here until we got the DCDC stabilized.

First of all, try to not go below 200 MHz sysclock, since I am not sure if the PLL settings are correct for low values or how low it can really get. Operating the chip at 200MHz even without cooling is no problem.

Then ensure the supply voltage is above 0.85V and ripple is within valid tolerance. Same goes for reference 1V8, reset and SPI signals. If you got access to the register, I guess you did that correctly. You can try to stress-test the inter-chip SPI by continuously reading the register of each chip over a longer period to ensure there are no issues.

Usually, the serious troubles begin when you supply the chips with work and they start to hash. The power draw immediately spikes for order of magnitudes and if DCDC is not capable to handle that, the voltage ripple eventually will exceed the tolerance. The chip then usually resets itself, and with that you usually lose access to it, since the chip becomes unaware that it is part of a chain. To regain access to it, you need to HW reset the whole chain and re-enumerate the chips again.

A strong indication that the chip was reset after it started hashing is the inability to read out its register, i.e. you e.g. write 0x0a02 to get register of chip 2, and you read back 0x0a02 instead of 0x1a02, meaning there is no chip 2 in the chain any more.
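The check described above could be coded along these lines. This is a C sketch built only from the 0x0a02 / 0x1a02 example; the names are hypothetical, and nothing here is taken from the data sheet or the reference driver.

```c
#include <stdint.h>

#define A1_CMD_READ_REG      0x0au /* command byte from the example above */
#define A1_CMD_READ_REG_RESP 0x1au /* header byte when the chip answers */

enum reg_reply { REPLY_OK, REPLY_CHIP_MISSING, REPLY_GARBAGE };

/* Classify the 16-bit header read back after asking chip `chip_id`
 * for its register. */
enum reg_reply classify_reg_reply(uint8_t chip_id, uint16_t reply)
{
    uint16_t sent = (uint16_t)((A1_CMD_READ_REG << 8) | chip_id);
    uint16_t want = (uint16_t)((A1_CMD_READ_REG_RESP << 8) | chip_id);

    if (reply == want)
        return REPLY_OK;
    if (reply == sent)
        return REPLY_CHIP_MISSING; /* command came back unanswered */
    return REPLY_GARBAGE;
}
```

Polling every chip with this after starting a job gives an early warning that a chip has reset and the chain needs a HW reset and re-enumeration.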

We detected problems in the DCDC by scoping the levels long term and triggering for levels outside the tolerance range.


As for your other issue with the endianness of the job command: the provided driver uses 8bit transfers and the create_job() function prepares the data for byte-wise operation. If you are not using 16bit and did not modify the source code, please post a trace of the related SPI transfer and I will double check with my logs.


zefir,

So my core voltage was a bit low - I had it at the lower end at 0.82V. I started with changing that to 0.84V as that was easy (to get to 0.85V I need to muck with the trim registers in the power supply - the documentation isn't really clear and I can way overvolt the chip if I get it wrong, so I didn't want to try that right away).

Things did get much better, but still not 100% so I think you are spot on for the power issue. All four chips are at least claiming they can process jobs.

When I was getting that single nonce, that was with sending the data you had posted directly rather than going through create_job. If I fix the endianness with create_job I get no results… I am using a byte-wise SPI transfer so I believe any 16-bit endianness issues should be ok.

The data I am uploading to chip 1 is (this is in transfer order):

17 01 49 4b f3 70 41 71 f3 e8 3f ea 17 04 10 d8
dc 17 8f 80 2f 12 6d cd b7 e4 2c 25 0a d8 18 a3
1f 8c 01 8e 98 d6 52 66 1f 27 19 10 0a b6 00 00
00 00 ff ff 00 1d ff ff ff ff

I did try faster PLL settings and am not seeing any difference in behavior.

I will work on getting the voltage up to 0.85V.

Thank you!
Post
Topic
Board Group buys
Re: [OPEN] Bitmine CoinCraft A1 28nm chip distribution / DIY support
by
totalslacker
on 15/02/2014, 02:17:16 UTC
OK, so got boards back this week and initial checks all look good! :) Have reliable SPI comms with all four chips and power looks to be nice and stable (for at least low frequency operation, need to add cooling before pushing them).

Zefir, your bring up instructions were very helpful - thank you! I was having trouble getting the chip to accept a job and then I finally realized I wasn't reading your instructions properly (had skipped a section). Fixed that and the chips are hashing :)

But, I'm not getting the same result as your test vector at https://bitcointalk.org/index.php?topic=294235.msg4454746#msg4454746

The registers all look to be good. I can submit the first and second job. The appropriate job active bits are set and the correct job id is set. But, I only get one nonce: 47 b8 a6 62

I get this same single nonce on both jobs and the two chips I've tried. I've looked over the test vector and compared it to my code as well as capturing the SPI bus on a SPI analyzer and it all looks correct…

Might anyone have some suggestions on what I'm missing?

Thanks!

So I realized that the midstate and wdata might be in the wrong byte order, so I swapped them using the create_job function from zefir's driver as a reference. No joy though, as now I'm not getting any nonce solutions.

It was interesting though that the first chip initially was taking a very long time to run the two jobs - as in a good 30 seconds (at 90MHz PLL). Chips 2 and 4 ran it in a few seconds as expected (chip 3 in my chain doesn't seem to be functional - SPI communications look to be good until I send it a job, at which point it stops responding).

I powered the system down for a while then tried again, and chip 1 was back to running at normal. So that's a tad concerning (the PLL had successfully locked in the first test where it was slow and the SPI clock was at the correct frequency for 90MHz operation). Maybe some hash engines weren't functional (even though it reported all 32 as good)?

Any pointers on my incorrect nonce results would be greatly appreciated!

Thanks!
Post
Topic
Board Group buys
Re: [OPEN] Bitmine CoinCraft A1 28nm chip distribution / DIY support
by
totalslacker
on 14/02/2014, 22:47:24 UTC
OK, so got boards back this week and initial checks all look good! :) Have reliable SPI comms with all four chips and power looks to be nice and stable (for at least low frequency operation, need to add cooling before pushing them).

Zefir, your bring up instructions were very helpful - thank you! I was having trouble getting the chip to accept a job and then I finally realized I wasn't reading your instructions properly (had skipped a section). Fixed that and the chips are hashing :)

But, I'm not getting the same result as your test vector at https://bitcointalk.org/index.php?topic=294235.msg4454746#msg4454746

The registers all look to be good. I can submit the first and second job. The appropriate job active bits are set and the correct job id is set. But, I only get one nonce: 47 b8 a6 62

I get this same single nonce on both jobs and the two chips I've tried. I've looked over the test vector and compared it to my code as well as capturing the SPI bus on a SPI analyzer and it all looks correct…

Might anyone have some suggestions on what I'm missing?

Thanks!
Post
Topic
Board Group buys
Re: [OPEN] Bitmine CoinCraft A1 28nm chip distribution / DIY support
by
totalslacker
on 13/02/2014, 19:44:56 UTC

FYI my design is a direct USB-SPI design. I know there are some concerns about CPU load on the host when communicating this way, but to me direct USB-SPI is the fastest way to a working board since I don't have to develop µ-controller firmware.


I went a similar route although I have both a micro-controller and a host SPI interface. I just used the header interface that totalphase.com uses for their beagle and aardvark SPI/I2C monitor/host adapters. This way I can use the aardvark tool to test out the SPI (and I2C that I use for other functions) functionality on the board without writing any code (they have a pretty nice cross-platform application for this) and then as I bring up the micro-controller I can monitor what actually happens on the bus.

So far it's working well, although I'm currently still testing out the power supplies. Getting the A1s going will be soon! :)
Post
Topic
Board Group buys
Re: [OPEN] Bitmine CoinCraft A1 28nm chip distribution / DIY support
by
totalslacker
on 13/02/2014, 01:53:33 UTC
Both ;) I must have started developing the driver based on an initial version of the specs and did not modify naming to the updated ones. Will be fixed in the driver I'll provide for upstream integration.

As for the 'work done' flag: this is something I found out tracking the register (already described somewhere in this thread) and which is currently being worked on to get integrated into the data sheet update by Bitmine. They are currently in the final steps of ramping up production, so please be patient for that.

OK - thank you! I think I saw that original post a long time ago and then failed to recall it :)

The chip chain partitioning comment there is also a good one. I currently have a four chip chain in my board (just got it back, am testing it now), I will look at partitioning that in the next revision.
Post
Topic
Board Group buys
Re: [OPEN] Bitmine CoinCraft A1 28nm chip distribution / DIY support
by
totalslacker
on 11/02/2014, 18:37:16 UTC
Zefir,

I was reading through your cgminer driver and had a question:

In the A1_scanwork function I see the following code:

Code:
hexdump("A1 RX", a1->spi_rx, 8);
if ((a1->spi_rx[5] & 0x02) != 0x02) {
    work_updated = true;
    struct work *work = wq_dequeue(&a1->active_wq);
    assert(work != NULL);

I assume this is checking a "work done" flag in the register, however I don't see this bit defined in the data sheet?

Actually, while I'm here it also seems that the driver code is setting the PLL values differently than in the data sheet. The code sets pre_div to be the first two bits in the register data whereas the data sheet defines it as being bits 44:40. Is the data sheet wrong or am I just reading this incorrectly?

Thank you!
Post
Topic
Board Group buys
Re: [OPEN] Bitmine CoinCraft A1 28nm chip distribution / DIY support
by
totalslacker
on 02/02/2014, 18:57:14 UTC
So I'm in the final stages of PCB layout here (hoping to release the board today or tomorrow). Still one question is pending however: are there any sequencing restrictions on bringing up the IO/analog and core voltage supplies for the A1? I'm currently planning to bring up IO, then the core supply - will this be ok?

I've asked Bitmine as well and the question has been forwarded to engineers but no reply as yet.
There are no related restrictions documented or known and I am not aware that anyone from the working designs is keeping some defined bring-up order. The only requirement documented is the reset sequence (1s low, then 1s high before first command is sent).

OK, thank you!
Post
Topic
Board Group buys
Re: [OPEN] Bitmine CoinCraft A1 28nm chip distribution / DIY support
by
totalslacker
on 01/02/2014, 21:20:07 UTC
So I'm in the final stages of PCB layout here (hoping to release the board today or tomorrow). Still one question is pending however: are there any sequencing restrictions on bringing up the IO/analog and core voltage supplies for the A1? I'm currently planning to bring up IO, then the core supply - will this be ok?

I've asked Bitmine as well and the question has been forwarded to engineers but no reply as yet.

Thanks for any help!