Resolved: Crash and freeze with a RFM69

Added by jeromelebel about 4 years ago

Hello,

I have an Arduino Mega with an RFM12B. I wanted to move to the RFM69, so I bought two. I updated my software to the latest version of jeelib. With my Arduino Mega, my code never returns from rf12_initialize(). I also tried the demo on an Arduino Uno, but there I get several reboots until the code freezes too.
Since I bought two RFM69s, I tried both, but I get exactly the same problems.

I’m not sure what is wrong or how to debug this issue.

Thanks for the help


Replies (69)

RE: Crash and freeze with a RFM69 - Added by martynj about 4 years ago

It’s all in the pictures - those two pins are bent at right angles so they do NOT go into the UNO header - then solder to flying wires (supplied in the kit) and plug in as shown.

This assumes you built the radio daughter board using all the supplied components: 6x resistors, caps and the 3-pin regulator.
Fully assembled, the daughter board needs a 5V supply (wandering wire # 1), is 5V signal tolerant and wandering wire # 2 connects to the INT0 position on the UNO header.

This brings the SPI bus length under control - many users are running with full SPI speed in this configuration.

RE: Crash and freeze with a RFM69 - Added by IngmarVerheij over 3 years ago

So I’ve got the same issue as jeromelebel and was wondering if this was ever solved…?

Basically the RFM69cw is found when the Uno boots. It can be configured but when I send data (eg 0t) it hangs in a reboot loop (see below).
What’s “strange” is that when I unplug the IRQ pin (D2) the loop is aborted and the device boots normally.

#define PINCHG_IRQ       0

is set in the sketch, RF12.cpp and RF69_avr.h.

I’ve tried dividing the SPI speed by 4 (and even uncommenting the code that sets the pinMode of RFM_IRQ to INPUT and enables the pull-up)… it doesn’t work.
For a brief moment it seemed to work when I lowered the Serial speed from 115200 to 57600, but that didn’t last long.

So, I have no idea how to fix this… help is welcome :-)

PS: A picture of the setup is attached
PS: I’ve just ordered an RFM12b as a short-term solution (to extend my OpenEnergyMonitor solution), but I’d like to use the RFM69cw’s in the future…
<pre>
[RFxConsole.0]FB
Available commands:
i - set node ID (standard node ids are 1..30)
b - set MHz band (4 = 433, 8 = 868, 9 = 915)
o - change frequency offset within the band (default 1600)
96..3903 is the range supported by the RFM12B
g - set network group (RFM12 only allows 212, 0 = any)
c - set collect mode (advanced, normally 0)
t - broadcast max-size test packet, request ack
…,a - send data packet to node , request ack
if using group 0 then sticky group number is used
…,s - send data packet to node , no ack
… - space character is a valid delimiter
128,,n - release group/node index number entry in eeprom
n - set group as sticky. Group 0 only, see p command
l - turn activity LED on PB1 on or off
…,m - add message string to ram, see p command
,,p post semaphore for group node , to be
sent with its next ack. Group number becomes sticky
q - set quiet mode (1 = don’t report bad packets)
x - set reporting format (0: decimal, 2: decimal+ascii
- 1: hex, 3: hex+ascii)
v - return firmware version and current settings
123z - total power down, needs a reset to start up again
Current configuration:
^ i30 g210 @ 868 MHz
> 0t
test 0 -> 66 b

[RFxConsole.0]00
[RFxConsole.0]00
[RFxConsole.0]00
[RFxConsole.0]00
[RFxConsole.0]00
[RFxConsole.0]00
[RFxConsole.0]00
[RFxConsole.0]00
[RFxConsole.0]00
</pre>

RE: Crash and freeze with a RFM69 - Added by JohnO over 3 years ago

Ah, you are using my code, RFxConsole. May I ask which branch of the jeelib library you downloaded, and whether it is a recent one?

RE: Crash and freeze with a RFM69 - Added by IngmarVerheij over 3 years ago

I’ve downloaded the “latest” version this morning

RE: Crash and freeze with a RFM69 - Added by JohnO over 3 years ago

What is the branch name?

RE: Crash and freeze with a RFM69 - Added by jeromelebel over 3 years ago

Unfortunately, I gave up. I didn’t have enough time to understand. I still wish to have the solution.

RE: Crash and freeze with a RFM69 - Added by JohnO over 3 years ago

When you are able to invest the time, jeromelebel, I am sure we will find you one.

RE: Crash and freeze with a RFM69 - Added by sschneider over 3 years ago

Hi all,

I have been having the same problem for months and have pretty much dug into the code in the same places: same hangs, same reboots, etc. I was also confused as to why I don’t see everybody having this problem. However, I did find a solution (not code) that is working for me with many of the RF69-based versions of Jeelib.

In my case, it really has nothing to do with the software, but with the hardware, and it explains why others aren’t seeing this on various boards (Jeelib, Moteino, Anarduino and other custom nodes). What I found is that trying to drive the RF69 with the same voltage divider circuit used for the RF12B doesn’t work. Even when I replaced the voltage divider circuit (for logic level conversion from 5.0 to 3.3) with a proper 74HC4050 level shifter, it still didn’t work. Most people assume that going from 5.0 to 3.3 needs a level shift, but that going from 3.3 to 5.0 needs nothing. While that may be technically correct, in practice, for whatever reason, it doesn’t work on the RF69s.

One way to test for yourself (but not my recommendation) is to do away with the resistive voltage divider and drive the RF69 at a full 5.0 volts from the Arduino (for a short period of time). For me it has been working perfectly. This also explains why you don’t see this problem if you are running a Jeenode at 3.3 volts. As long as Vcc and the SPI bus are all at the same voltage, the RF69 works fine for me in all cases. The only solution I have not tried is a proper level shift in both directions (5.0 to 3.3 and 3.3 to 5.0).

Again, clean up your circuit and run the RF69 directly from the 5.0 volt Arduino to verify that this is the problem (remember that you have probably made many changes to the original JeeLib library, so you might need to revert). Also remember that you are NOT supposed to drive the RF69 at 5.0 volts. But I have been testing for days (a few hours at a time) and it works perfectly with the OOTB Jeelib RF69 support.

I realize that this may not be the case for everybody, but I have been watching this thread for a while and searching the communities, and everybody is looking in the same place with no stable solution. But it’s only when working with an Arduino running at 5.0 volts. At least this is 100% the case for me.
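
For anyone checking their own divider, the arithmetic behind a resistive 5.0 to 3.3 conversion can be sketched as below. This is only an illustration: the function names are made up for this example, and the 4.7 kΩ / 10 kΩ values in the usage note are assumed values, not something stated in this thread.

```cpp
#include <cassert>

// Output voltage of an unloaded resistive divider:
//   Vin -- R1 --+-- R2 -- GND, output taken at the junction.
double dividerOut(double vin, double r1Ohms, double r2Ohms) {
    assert(r1Ohms + r2Ohms > 0.0);
    return vin * r2Ohms / (r1Ohms + r2Ohms);
}

// Source resistance the divider presents to the radio input:
// R1 in parallel with R2. Together with any stray capacitance on the
// line, this slows the signal edges.
double dividerSourceOhms(double r1Ohms, double r2Ohms) {
    assert(r1Ohms + r2Ohms > 0.0);
    return r1Ohms * r2Ohms / (r1Ohms + r2Ohms);
}
```

With the assumed values, dividerOut(5.0, 4700.0, 10000.0) comes out at about 3.4 V, and dividerSourceOhms(4700.0, 10000.0) at roughly 3.2 kΩ. A divider only helps in the 5.0 to 3.3 direction; it does nothing for signals coming back from the radio.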

RE: Crash and freeze with a RFM69 - Added by JohnO-mobile over 3 years ago

Wow, a very interesting post. I hope martynj has something in mind to avoid the 5v.

RE: Crash and freeze with a RFM69 - Added by martynj over 3 years ago

sschneider,

Thanks for the observations - the symptoms clearly show that the SPI bus is not happy. I suspect the overdrive to 5V is masking the real problem.
Can you post pictures of your setup (I’m interested in the length/layout of the SPI interconnections) and ideally any scope shots of the SPI waveforms?

As you noted, running the RFM69 at 5V is well outside the RF chip absolute maximum rating of 3.9V for Vdd.
The specification sheet has the following (perhaps over-cautious) warning:

Stresses above the values listed .... may cause permanent device failure. 
Exposure to absolute maximum ratings for extended periods may affect device reliability.

RE: Crash and freeze with a RFM69 - Added by IngmarVerheij over 3 years ago

To be honest, this is a “bit” out of my comfort zone. But it does sound plausible and I’m happy to contribute to a permanent solution.

I found multiple people who investigated how to run the RFM69 on a 5V circuit, this gal went all the way and seems to describe it very well:

Hope this is helpful for those who actually understand what she wrote and might find a solution :-)

martynj: I posted a picture of my setup, the wires are as short as possible

RE: Crash and freeze with a RFM69 - Added by JohnO over 3 years ago

So, is it actually the RFM69 with the issue? As I read the post, the RFM69 is signalling back at 3v3 to the ATMega, which is looking to receive a 5v swing.

Also, the reset posted by IngmarVerheij appears to show the ATMega crashing, not just failing to communicate over SPI.
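
To put a number on the 3v3-into-5v question, here is a rough worst-case margin calculation. The two threshold fractions are assumptions recalled from the ATmega328P and SX1231 (RFM69) datasheets and should be checked there before relying on them; the names are invented for this sketch.

```cpp
// Assumed figures, to be verified against the datasheets:
//   ATmega328P input-high threshold  V_IH(min) = 0.6 * Vcc
//   RFM69 (SX1231) output-high level V_OH(min) = 0.9 * Vdd
constexpr double kAvrVihFraction   = 0.6;
constexpr double kRadioVohFraction = 0.9;

// Worst-case margin in volts when a radio output drives an AVR input
// directly. Positive: the guaranteed output-high level is above the AVR
// input threshold. Negative: a valid high is not guaranteed to be seen.
double highLevelMargin(double radioVdd, double avrVcc) {
    return kRadioVohFraction * radioVdd - kAvrVihFraction * avrVcc;
}
```

Under those assumptions, highLevelMargin(3.3, 5.0) is about -0.03 V, so there is essentially no guaranteed margin, while highLevelMargin(3.3, 3.3) is about +0.99 V. That would be consistent with nodes running everything at 3.3 V behaving well, but it does not by itself explain the ATMega resetting.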

RE: Crash and freeze with a RFM69 - Added by martynj over 3 years ago

@Ingmar,

Please bear in mind that physically short is not always electrically short (e.g. there is hidden extra shunt capacitance loading the SPI lines there from the breadboard internal tracks). To aid the diagnosis, it would be helpful to use this stacking style of connection.

This keeps the bus electrical length to the minimum.

RE: Crash and freeze with a RFM69 - Added by sschneider over 3 years ago

@martynj. Apologies, I do NOT want to emphasize using 5 v. I was only suggesting that for a super quick verification, and I agree that it might “mask” a problem. However, when running my nodes at 3.3 (not connected to the USB Arduino board, but rather an ATmega328p or ATtiny84 on battery power), everything works fine with no code changes to the Jeelib library, and in these cases I am not overdriving the RF module. This would also explain why I don’t see a lot of people having problems when they are using a Jeenode (which runs at a regulated 3.3). I have browsed many forums and have only seen people having problems when connected to some type of environment that is at 5 volts (presumably to be able to use the serial monitor to debug problems).

I agree that there definitely might be a software solution related to the SPI bus (I remember months ago that the Airspayce libraries worked much better for me, but they are too big for the ATtinys). I never understood why my standalone battery-powered nodes worked fine while I could never use the serial monitor to debug other code (very useful). I guess I am just suggesting to folks who aren’t comfortable digging in the code for a solution: just try a properly driven 3.3 node. It works.

I’ve posted my 5.0v temporary test, which by the way is terrible: long loose connections and bad solder joints, yet with 1000s of successful transmissions. As a matter of fact, I have crappy wires soldered to the bottom of my “node” directly to the SPI pins that run back to the Arduino (again, I do realize that this could be masking a problem). However, if you want to see my 3.3 connection, just look at a JeeNode v6. All I did was build a JeeNode v6, replacing the RFM12B with the RFM69CW, and use the RFxConsole library unmodified.

Again, it is NOT a solution to use 5.0; I was just pointing out that most folks are seeing problems while using it in a 5.0v environment with a level shift. I think I screwed around for weeks in my development environment trying to figure this out (never found a solution), but magically my remote nodes running at 3.3v worked right away. I was hoping this might point you guys to reasons the library isn’t working in a level-shifted environment (timing, capacitance, etc…). All I know for a fact is that I have the same problems that I have seen on many forums (attachInterrupt, reboots, hangs, 90% packet loss) until I get all components operating at the same voltage.

I do think the solution is to properly level-shift all lines or run everything at the same voltage. For you guys putting in all the time with libraries (many many thanks!) I think experimenting with these voltages while you debug might help to find out how to make the library more robust in level shifted environments. I don’t have these capabilities.

RE: Crash and freeze with a RFM69 - Added by IngmarVerheij over 3 years ago

martynj
I can confirm that when I use the stacking style of connection the connection is stable (in other words, it works) - see picture

Also I found that the RF12demo in Jeelib can actually receive data from an RFM12b - but only after a full power cycle (no reset) of the Arduino. The RF12demo (and others) of the jeelib-RFxConsole library does not receive the data… Strange!
However, usually when data is received there are bad packets (CRC is not OK). Is there something to improve the quality of the data (unfortunately I need to cover two floors)?

And, I’d like to use the RFM69 from a breadboard to prototype… Any ideas how this can be used?
I built a prototype with soldered wires, which has the same issue (implying it is very sensitive to the wire quality).

RE: Crash and freeze with a RFM69 - Added by martynj over 3 years ago

@Ingmar,

Progress! Now the setup is more stable, the debugging can proceed.
You mentioned bad CRC packets - a trace of these mixed in with the good will be useful. The RSSI value printed on the end of the line by rf12demo is informative.

First turn off quiet mode with the q command - this gives a measure of how noisy the background environment is.
Then turn it back on to filter packets that have the correct group id. The background bad packets are still there, but no longer displayed.

Of course if the Rx is busy with a bad packet, it cannot see a good packet coming in at the same time, so some good packets can be missed. Putting a sequence number in the Tx packet is helpful to see this happening.
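
As a rough illustration of the sequence number idea: the sender puts a one-byte counter in the payload and the receiver counts the gaps. The sketch below covers only the receiver-side bookkeeping. Carrying the counter as a payload byte, and the SeqTracker name, are assumptions made for this example and not part of JeeLib.

```cpp
#include <cstdint>

// Receiver-side bookkeeping for a one-byte sequence number that the
// sender increments for every packet (hypothetical payload layout).
struct SeqTracker {
    bool     seenAny = false;  // false until the first good packet arrives
    uint8_t  lastSeq = 0;      // sequence number of the last good packet
    uint32_t missed  = 0;      // running total of packets never seen

    // Feed the sequence byte of each good packet. Returns how many
    // packets were missed since the previous good one. Modulo-256
    // arithmetic handles the counter wrapping from 255 to 0. Note that
    // a repeated sequence number (a retransmission) shows up as 255.
    uint8_t update(uint8_t seq) {
        uint8_t gap = 0;
        if (seenAny) {
            gap = static_cast<uint8_t>(seq - lastSeq - 1);
        }
        seenAny = true;
        lastSeq = seq;
        missed += gap;
        return gap;
    }
};
```

On the receiving node one would call update() with the counter byte of every packet reported as OK, and print the returned gap next to the RSSI.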

RE: Crash and freeze with a RFM69 - Added by IngmarVerheij over 3 years ago

Progress indeed! :-)
How should I read the RSSI value?
There are good packets (node id 20 - which is 50cm away) with an RSSI of around -90.
Bad packets (node ids 5, 6, 7, 10 and 15 - which are two floors below) have an RSSI ranging between -95 and -107. Because the data is garbled the node id isn’t always visible (see trace below).
A simple conclusion could be: the distance is too far. However, what’s strange is that everything that I send from this device (node id 30) is received successfully on the node (15) which is two floors below…
So:
* node 30 (this device) -> node 15 (RPi) = 100% success rate
* node 15 -> node 30 = bad packets
<pre>
[RF12demo.12] ^ i30 g210 @ 868 MHz
Available commands:
i - set node ID (standard node ids are 1..30)
b - set MHz band (4 = 433, 8 = 868, 9 = 915)
o - change frequency offset within the band (default 1600)
96..3903 is the range supported by the RFM12B
g - set network group (RFM12 only allows 212, 0 = any)
c - set collect mode (advanced, normally 0)
t - broadcast max-size test packet, request ack
…, a - send data packet to node , request ack
…, s - send data packet to node , no ack
q - set quiet mode (1 = don’t report bad packets)
x - set reporting format (0: decimal, 1: hex, 2: hex+ascii)
123 z - total power down, needs a reset to start up again
Remote control commands:
,,, f - FS20 command (868 MHz)
,, k - KAKU command (433 MHz)
Current configuration:
^ i30 g210 @ 868 MHz
? 80 32 161 133 0 32 129 0 49 0 52 0 0 16 16 34 20 40 0 97 34 (-105)
? 128 (-105)
? 10 40 246 220 0 168 23 105 0 128 222 115 0 92 11 0 0 18 5 0 0 (-93)
? 202 1 66 16 64 32 160 144 0 0 0 160 48 0 32 112 16 32 40 32 40 (-105)
? 40 0 0 0 0 0 0 0 64 0 0 32 0 0 0 0 0 0 0 0 0 (-104)
? 196 168 17 25 48 106 40 52 8 224 49 20 32 72 96 50 35 (-104)
? 160 4 0 51 17 1 0 0 34 64 17 32 40 0 32 49 52 0 16 0 18 (-104)
? 64 (-104)
? 194 0 42 129 5 4 133 48 48 24 32 161 160 144 0 48 0 0 16 0 40 (-106)
OK 20 210 0 0 0 39 2 27 0 (-88)
? 128 34 8 0 32 0 3 128 1 0 0 130 1 4 33 16 46 0 8 16 40 (-104)
? 64 (-103)
? 0 36 41 0 32 32 2 64 32 32 32 32 48 32 32 0 100 32 178 0 32 (-105)
? 2 42 246 156 0 168 23 105 0 0 220 115 0 92 11 0 0 18 5 0 0 (-95)
</pre>

RE: Crash and freeze with a RFM69 - Added by martynj over 3 years ago

@Ingmar,

Could you sketch the position/distances of all the nodes involved? I’m not following your description well. The ‘50cm’ for node 20 does not match up with the reported -88 dBm.
Are the RF modules all “native” 868 MHz? That packet dump from node 20 looks like an OpenEnergy device - they predominantly use 433 MHz modules.

Please extend the trace as requested:

First turn off quiet mode with the q command - this gives a measure of how noisy the background environment is.
Then turn it back on to filter packets that have the correct group id. The background bad packets are still there, but no longer displayed.

You can upload the trace file - easier to grab and filter then.
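
For filtering a saved trace on a PC, something along these lines could pull the good/bad flag and the RSSI out of each report line. It is a sketch that assumes the line format seen in the trace posted earlier in this thread ("OK ..." or "? ..." followed by the RSSI in parentheses); parseTraceLine and TraceLine are names invented for this example.

```cpp
#include <cstdlib>
#include <string>

struct TraceLine {
    bool crcOk;  // true for "OK ..." lines, false for "? ..." lines
    int  rssi;   // value printed in the trailing parentheses
};

// Parse one report line such as "OK 20 210 0 0 0 39 2 27 0 (-88)".
// Returns false for anything else (help text, blank lines, banners).
bool parseTraceLine(std::string line, TraceLine& out) {
    // Forum software may turn "-" into an en dash (UTF-8 E2 80 93); undo it.
    const std::string enDash = "\xE2\x80\x93";
    std::size_t p;
    while ((p = line.find(enDash)) != std::string::npos) {
        line.replace(p, enDash.size(), "-");
    }

    const bool ok  = line.rfind("OK ", 0) == 0;  // line starts with "OK "
    const bool bad = line.rfind("? ", 0) == 0;   // line starts with "? "
    if (!ok && !bad) return false;

    // The RSSI is the number between the last '(' and the last ')'.
    const std::size_t open  = line.rfind('(');
    const std::size_t close = line.rfind(')');
    if (open == std::string::npos || close == std::string::npos ||
        close < open + 2) {
        return false;
    }
    const std::string num = line.substr(open + 1, close - open - 1);
    char* end = nullptr;
    const long value = std::strtol(num.c_str(), &end, 10);
    if (end == num.c_str() || *end != '\0') return false;

    out.crcOk = ok;
    out.rssi  = static_cast<int>(value);
    return true;
}
```

Feeding every line of the uploaded file through this and tallying crcOk against rssi gives a quick view of whether the bad packets cluster at the weak end of the range.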

RE: Crash and freeze with a RFM69 - Added by IngmarVerheij about 3 years ago

Sorry for the late response

You're right, I was mixing up two nodes.
The one that was next to my setup actually had an empty battery (...), after replacing it the value was around -30.
The other one (indeed an emonTH) was one floor down, and that's actually the maximum range. One floor, not two. As a result, my POC failed. Switching to WiFi for now :-)

Thanks for the help!
