Discourse Build Error: "An error occurred while installing rbtrace (0.5.1), and Bundler cannot continue."

Instead of changing the version of rbtrace, I will try removing it:

after_code:
    - exec:
        cd: $home
        cmd:
          - sed -i '/rbtrace/d' Gemfile
          - sed -i '/rbtrace/d' Gemfile.lock
          - su discourse -c 'bundle config unset deployment'
          - su discourse -c 'bundle install --no-deployment --path vendor/bundle --jobs 4 --without test development'

This works! I can rebuild by removing rbtrace completely, which I guess might not be required for production (maybe I'm wrong):

Successfully bootstrapped, to startup use ./launcher start socket3
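
For anyone who wants to see what those two sed lines do before touching a real container, here is a minimal sketch on throwaway files. The helper name strip_gem and the sample file contents are assumptions made up for illustration; only the sed expression comes from the post. Note that the pattern is a plain substring match, so it deletes any line that mentions the gem name anywhere.

```shell
# strip_gem NAME FILE...: delete every line mentioning NAME from each FILE,
# in place. Uses GNU sed's -i syntax, the same form as the commands in the post.
strip_gem() {
  name=$1
  shift
  sed -i "/$name/d" "$@"
}

# Demo on throwaway files; the contents are made-up stand-ins,
# not the real Discourse Gemfile or Gemfile.lock.
demo=$(mktemp -d)
printf "gem 'puma'\ngem 'rbtrace'\ngem 'stackprof', require: false\n" > "$demo/Gemfile"
printf '    puma (6.4.0)\n    rbtrace (0.5.1)\n' > "$demo/Gemfile.lock"
strip_gem rbtrace "$demo/Gemfile" "$demo/Gemfile.lock"
cat "$demo/Gemfile" "$demo/Gemfile.lock"   # no rbtrace lines should remain
rm -rf "$demo"
```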

So, I will go back to the main branch and try to upgrade with rbtrace removed, rebuild the container, and see what happens.

:slight_smile:

3 Likes

Yay

We now have the container up and running:

condor2769://var/disco# docker ps
CONTAINER ID        IMAGE                     COMMAND                  CREATED             STATUS              PORTS                                NAMES
938f19881dbb        local_discourse/socket1   "/sbin/boot"             6 seconds ago       Up 5 seconds                                             socket1
5159155955c0        local_discourse/socket2   "/sbin/boot"             4 weeks ago         Up 4 weeks                                               socket2
f48ca08b40f6        local_discourse/pg13      "/sbin/boot"             23 months ago       Up 23 months                                             pg13

So, next I will try to cut over to that container (socket1) using my admin script.

2 Likes

OK.

We are now running the latest version of discourse with all upgrades minus rbtrace.

Please post back with screenshots if you see any errors.

Thanks.

Anyway, according to the babble-bot ChatGPT, rbtrace is not recommended for production apps and it is mostly used in development, so maybe we will get lucky and all goes well :slight_smile:

I'm fairly sure the folks over at discourse meta would frown on this hack, but since they cannot (or will not) fix the error, might as well remove the underlying unnecessary cause.

Cheers

:slight_smile:

PS: For ChatGPT fans, ChatGPT could not come up with this solution after providing the error and asking many times for some fixes in various ways and prompts. All of the ChatGPT suggestions did not work because the issue is within the discourse container where the app was built. I don't think ChatGPT would have ever recommended removing rbtrace from a production app, proving I am smarter than ChatGPT, hahahaha (that was an easy one).

Anyway, it was handy to run some ideas by ChatGPT during this debugging cycle.

1 Like

Checking in the container, nothing stands out as a production show-stopper:

root@discourse-socket1:/var/www/discourse# grep -r "rbtrace" . | grep -v vendor
grep: ./vendor/bundle/ruby/3.2.0/gems/rbtrace-0.5.0/ext/libmsgpackc_ext.a: binary file matches
grep: ./vendor/bundle/ruby/3.2.0/gems/rbtrace-0.5.0/ext/src/msgpack-1.1.0/src/vrefbuffer.o: binary file matches
grep: ./vendor/bundle/ruby/3.2.0/gems/rbtrace-0.5.0/ext/src/msgpack-1.1.0/src/objectc.o: binary file matches
grep: ./vendor/bundle/ruby/3.2.0/gems/rbtrace-0.5.0/ext/src/msgpack-1.1.0/src/version.o: binary file matches
grep: ./vendor/bundle/ruby/3.2.0/gems/rbtrace-0.5.0/ext/src/msgpack-1.1.0/src/.libs/libmsgpackc.a: binary file matches
grep: ./vendor/bundle/ruby/3.2.0/gems/rbtrace-0.5.0/ext/src/msgpack-1.1.0/src/.libs/libmsgpack.a: binary file matches
grep: ./vendor/bundle/ruby/3.2.0/gems/rbtrace-0.5.0/ext/src/msgpack-1.1.0/src/unpack.o: binary file matches
grep: ./vendor/bundle/ruby/3.2.0/gems/rbtrace-0.5.0/ext/src/msgpack-1.1.0/src/zone.o: binary file matches
grep: ./vendor/bundle/ruby/3.2.0/gems/rbtrace-0.5.0/ext/dst/lib/libmsgpackc.a: binary file matches
grep: ./vendor/bundle/ruby/3.2.0/gems/rbtrace-0.5.0/ext/dst/lib/libmsgpack.a: binary file matches
grep: ./vendor/bundle/ruby/3.2.0/gems/rbtrace-0.5.0/lib/rbtrace.so: binary file matches
grep: ./vendor/bundle/ruby/3.2.0/specifications/rbtrace-0.5.0.gemspec: binary file matches
grep: ./vendor/bundle/ruby/3.2.0/extensions/x86_64-linux/3.2.0/rbtrace-0.5.0/rbtrace.so: binary file matches
./spec/rails_helper.rb:require "rbtrace" if RUBY_ENGINE == "ruby"
./script/memory-analysis:  `rbtrace -p #{pid} -e 'Thread.new{GC.start;require "objspace";io=File.open("#{dumpfile}", "w"); ObjectSpace.dump_all(output: io); io.close}'`
./script/memory-analysis:    STDERR.puts "Unable to find dumpfile #{dumpfile}, is rbtrace running properly, did you pick the right pid?"
./script/diff_heaps.rb:# rbtrace -p 15193 -e 'Thread.new{require "objspace"; ObjectSpace.trace_object_allocations_start; GC.start(full_mark: true); ObjectSpace.dump_all(output: File.open("heap.json","w"))}.join'
./config/application.rb:    require "rbtrace" if ENV["RBTRACE"] == "1"
./config/environments/development.rb:  require "rbtrace" if RUBY_ENGINE == "ruby"

Looks like rbtrace is only used in development and debugging; so we should be just fine without it.
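
A slightly quieter way to run the same search is sketched below. As far as I know, newer GNU grep versions print the "binary file matches" notices on standard error, which would explain why piping through grep -v vendor did not filter them out above. The function name find_gem_refs and the demo tree are assumptions for illustration; the flags -r, -I and --exclude-dir are standard GNU grep options.

```shell
# find_gem_refs NAME DIR: list references to NAME in text files under DIR.
#   -r                    recurse into DIR
#   -I                    treat binary files as non-matching (no binary notices)
#   --exclude-dir=vendor  skip any directory named vendor
find_gem_refs() {
  grep -rI --exclude-dir=vendor -- "$1" "$2"
}

# Demo on a throwaway tree; these file names and contents are made up.
demo=$(mktemp -d)
mkdir -p "$demo/config" "$demo/vendor/bundle"
echo 'require "rbtrace" if ENV["RBTRACE"] == "1"' > "$demo/config/application.rb"
echo 'rbtrace vendored copy' > "$demo/vendor/bundle/notes.txt"
find_gem_refs rbtrace "$demo"   # only the config/ file should be listed
rm -rf "$demo"
```

Inside the container this would be called as find_gem_refs rbtrace /var/www/discourse, assuming the same working directory as in the prompt shown above.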

2 Likes

I wouldn't suppose for one minute that anyone else has done the work you've just done. I also don't suppose that the Discourse team have even clocked that these update failures all involved Ruby Trace 0.5.x.

If they had, it would be easy enough to issue the temporary work-around to build without Ruby Trace, but they are unable to read! The error actually says the issue is with installing rbtrace. Perhaps the rbtrace version upgrade now requires a slightly different data structure that they also haven't clocked.

Methinks they ain't too bright. Users can now find a (temporary) solution here if they search.

Merry Christmas!

2 Likes

FYI, Google Bard gave the following response to the same question:


- **Performance overhead:** Rbtrace introduces overhead due to its instrumentation and tracing activities. This can impact application response times and overall throughput, potentially degrading user experience.
- **Security risks:** Enabling tracing capabilities in production might expose sensitive information like method calls, arguments, and internal application logic, increasing potential attack vectors.
- **Stability concerns:** Rbtrace modifications to the Ruby runtime can introduce instability and unexpected behavior, especially in complex production environments.
- **Debugging focus:** Rbtrace is primarily designed for debugging and profiling in development and testing environments, where performance and security constraints are less stringent.

**Alternatives for production environments:**

- **Logging and monitoring tools:** Collect detailed performance and error data without runtime intrusion.
- **Application performance monitoring (APM) tools:** Provide in-depth analysis of application behavior and bottlenecks.
- **Dedicated profiling tools:** Designed for production profiling with minimal overhead, such as New Relic, Scout, or AppSignal.
- **Feature flags and controlled rollouts:** Isolate potential issues and release features safely in production.
- **Thorough testing and staging environments:** Identify and address performance issues before production deployment.

**When to consider rbtrace in production:**

- **Crucial diagnostic needs:** If you're facing a critical issue that can't be resolved through other means and requires deep inspection of runtime behavior.
- **Short-term, controlled usage:** Use it for brief periods with careful monitoring and under specific conditions.

**Best practices if using rbtrace in production:**

- **Minimize overhead:** Use selective tracing for specific methods or code blocks.
- **Monitor performance:** Closely track application performance and resource usage.
- **Ensure security:** Implement measures to protect sensitive data and mitigate risks.
- **Disable after debugging:** Remove rbtrace once the issue is resolved to avoid long-term impacts.

2 Likes

Here is one of the reasons why Discourse has this problem:

I think, and it correlates with my experience a long time ago at meta, that the core team is too much "in love" with their worldview of tech.

Quoting from above....

rbtrace + stackprof

At Discourse we include rbtrace and stackprof in our Gemfile.

gem 'rbtrace'
gem 'stackprof', require: false

We always load up rbtrace in production, it allows us a large variety of production level debugging. stackprof is loaded on-demand.

So, they "always include rbtrace in production", but when it causes failures, they cannot figure out why, or they ask you what your OS version is, etc. They could simply have an option in the build config to omit it, but no; that would go against their "we are in love with ourselves" worldview. haha

Cheers.

Thanks.

Hahah... if I don't do the required analysis, who will? Certainly no one will do it over at meta discourse. Of this I am sure, and that's why I stopped posting there. Too much ego and too much confirmation bias** and not enough technical depth.

As mentioned, those guys over there are too much in love with themselves to perform a true objective analysis of their own software, in my view. Long ago, I saw how their core team refused to listen to anyone (like me) who is more objective and less biased, and not "in love" with their stack. It's all about ego for many people.

Oh Well..... at least we figured it out.

Strip out the unnecessary rbtrace and enjoy the holidays!

Thanks for all the ideas!

:slight_smile:

Footnote

** Confirmation bias is the tendency to search for, interpret, favor, and recall information in a way that confirms or supports one's prior beliefs or values.

3 Likes

Remove rbtrace once the issue is resolved...... don't force every Discourse admin to include it in production. At least provide an option in the build .yml file to exclude it.

There it is! Thanks.

PS: I have never had to use rbtrace for any production rails app and certainly never with the Discourse app.

4 Likes

FWIW...

I noticed that after rebuilding the container with rbtrace stripped out of the Gemfile, inside the container were gem lib files for rbtrace 0.5.0, I assume from the Discourse docker base image.

root@discourse-socket1:/var/www/discourse# grep -r "rbtrace" . | grep -v vendor
grep: ./vendor/bundle/ruby/3.2.0/gems/rbtrace-0.5.0/ext/libmsgpackc_ext.a: binary file matches
grep: ./vendor/bundle/ruby/3.2.0/gems/rbtrace-0.5.0/ext/src/msgpack-1.1.0/src/vrefbuffer.o: binary file matches
grep: ./vendor/bundle/ruby/3.2.0/gems/rbtrace-0.5.0/ext/src/msgpack-1.1.0/src/objectc.o: binary file matches
grep: ./vendor/bundle/ruby/3.2.0/gems/rbtrace-0.5.0/ext/src/msgpack-1.1.0/src/version.o: binary file matches
grep: ./vendor/bundle/ruby/3.2.0/gems/rbtrace-0.5.0/ext/src/msgpack-1.1.0/src/.libs/libmsgpackc.a: binary file matches
grep: ./vendor/bundle/ruby/3.2.0/gems/rbtrace-0.5.0/ext/src/msgpack-1.1.0/src/.libs/libmsgpack.a: binary file matches
grep: ./vendor/bundle/ruby/3.2.0/gems/rbtrace-0.5.0/ext/src/msgpack-1.1.0/src/unpack.o: binary file matches
grep: ./vendor/bundle/ruby/3.2.0/gems/rbtrace-0.5.0/ext/src/msgpack-1.1.0/src/zone.o: binary file matches
grep: ./vendor/bundle/ruby/3.2.0/gems/rbtrace-0.5.0/ext/dst/lib/libmsgpackc.a: binary file matches
...
...

So, I tried the same method as above, but this time specifying:

   - sed -i '/rbtrace/d' Gemfile
   - sed -i '/rbtrace/d' Gemfile.lock
   - echo "gem 'rbtrace', '= 0.5.0'" >> Gemfile

and it compiled with rbtrace as it should.

...
...
Using redis-namespace 1.11.0
Using pry 0.14.2
Using puma 6.4.0
Using net-http 0.4.0
Using rbtrace 0.5.0     # <--- here it is! Works when the Gemfile specifies the exact version
Using oauth-tty 1.0.5
Using snaky_hash 2.0.1
Using rqrcode 2.2.0
Using rss 0.3.0
Using loofah 2.22.0
Using unicorn 6.1.0
Using net-imap 0.4.8
Using net-pop 0.1.2
Using net-smtp 0.4.0
...
...

This means the problem is more than likely that rbtrace version 0.5.0 is included in the base discourse docker image (which is pulled when installing), but the discourse Gemfile does not specify that version number, so Bundler pulls version 0.5.1, causing a build error when bundling the gems. Fairly basic stuff.
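
The mismatch described here can be checked directly by comparing the version sitting in vendor/bundle with the version recorded in Gemfile.lock. The sketch below is only an illustration: the two function names are made up, the directory layout is taken from the grep output earlier in the thread, and the lockfile parsing assumes the usual format where a resolved gem appears as four spaces followed by "name (version)".

```shell
# vendored_gem_versions NAME BUNDLE_DIR: print each version of NAME found under
# BUNDLE_DIR, read off directory names such as .../gems/NAME-0.5.0.
# Caveat: a different gem whose name starts with "NAME-" would also match.
vendored_gem_versions() {
  for dir in "$2"/ruby/*/gems/"$1"-*; do
    [ -d "$dir" ] || continue
    basename "$dir" | sed "s/^$1-//"
  done
}

# locked_gem_version NAME LOCKFILE: print the version Gemfile.lock resolves NAME to.
# Lines indented deeper than four spaces (dependency constraints) are ignored.
locked_gem_version() {
  sed -n "s/^    $1 (\([^)]*\))\$/\1/p" "$2"
}

# Demo with made-up stand-ins for the container's files.
demo=$(mktemp -d)
mkdir -p "$demo/vendor/bundle/ruby/3.2.0/gems/rbtrace-0.5.0"
printf '  specs:\n    puma (6.4.0)\n    rbtrace (0.5.1)\n' > "$demo/Gemfile.lock"
echo "vendored: $(vendored_gem_versions rbtrace "$demo/vendor/bundle")"
echo "locked:   $(locked_gem_version rbtrace "$demo/Gemfile.lock")"
rm -rf "$demo"
```

If the two values differ, that is the same situation as the 0.5.0 versus 0.5.1 mismatch found here.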

So, this fix works (in the current discourse version) added to the build .yml file:

   - sed -i '/rbtrace/d' Gemfile
   - sed -i '/rbtrace/d' Gemfile.lock
   - echo "gem 'rbtrace', '= 0.5.0'" >> Gemfile

However, this fix will fail when discourse updates rbtrace in their docker images, unless they also update their Gemfile (or they get lucky and the version in their docker image matches the version requested in the Gemfile).
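
One way to make the pin less brittle would be to read the version off whatever is already in vendor/bundle instead of hard-coding 0.5.0. This is an untested idea sketched under assumptions, not something that was run against a real Discourse build: the function name pin_to_vendored is made up, and it assumes the gem directory from the base image is already present when the hook runs. Gemfile.lock would still need its rbtrace lines stripped separately, as in the fix above.

```shell
# pin_to_vendored NAME BUNDLE_DIR GEMFILE: drop existing NAME lines from GEMFILE
# and append an exact pin to the first NAME version found under BUNDLE_DIR.
# Returns 1 and leaves GEMFILE untouched when no vendored copy is found.
pin_to_vendored() {
  version=
  for dir in "$2"/ruby/*/gems/"$1"-*; do
    [ -d "$dir" ] || continue
    version=$(basename "$dir" | sed "s/^$1-//")
    break
  done
  [ -n "$version" ] || return 1
  sed -i "/$1/d" "$3"
  printf "gem '%s', '= %s'\n" "$1" "$version" >> "$3"
}

# Demo with made-up stand-ins for the container's files.
demo=$(mktemp -d)
mkdir -p "$demo/vendor/bundle/ruby/3.2.0/gems/rbtrace-0.5.0"
printf "gem 'puma'\ngem 'rbtrace'\n" > "$demo/Gemfile"
pin_to_vendored rbtrace "$demo/vendor/bundle" "$demo/Gemfile"
cat "$demo/Gemfile"   # the rbtrace line should now carry an exact version pin
rm -rf "$demo"
```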

So, that is the full take on this error.

Successfully bootstrapped, to startup use ./launcher start socket3

So, this is a Discourse build error for sure, and it has nothing to do with the OS, docker, etc., in my view, because specifying the version in the Gemfile that matches the docker image pulled during install solves the problem.

Merry Christmas!

1 Like

It's noteworthy to add that I could not have found this "version issue" solution without first stripping out rbtrace from the bundle so I could enter the container to search for rbtrace references.

After doing so, I found that the rbtrace version in the discourse docker image pulled from the net (v0.5.0) was not the same as the one in the Gemfile bundle error (v0.5.1).

So, there are some interesting lessons to be learned from this analysis.

1 Like

Just remind me, when did you say that you were joining the Discourse support team?

1 Like

Never in the karmic lifecycle as a stream of consciousness in this universe.

"Clowns to the left of me. Jokers to the right .... here I am ....."

:smiley:

2 Likes

This goes to confirm my earlier (harsh) statements about that group.

They are "all over the map" with ideas (asking irrelevant questions like OS version and docker versioning), but none of them could be bothered to take the time to actually figure out what the root cause of the problem was.

It does not trouble me that they are not very deep (and broad) technically speaking and exhibit poor troubleshooting and computer software analytical skills.

What causes me to pause is their arrogance in believing they are much more skillful than they actually are and how they degrade others (like me a few years ago over there at meta, when I was helping them out for free) who are more skillful with more experience.

Yes, it can be tricky analyzing a complex stack such as discourse; but when we see a bundle error with a Ruby gem, that is where we should search for the error, not in the OS or the container and other areas in the stack which are far away from the error condition.

In this case, the error was based on a mismatch between the rbtrace version compiled in the core discourse docker image (v0.5.0) and the improperly formatted Gemfile spec for rbtrace which tried to install version 0.5.1 during the final build process, throwing the error.

By stripping out rbtrace (which is not required for production) during the final build process, we can enter the container on a successful build and then search for references to rbtrace and do some basic analysis to see what is going on.

At first, I tried:

 - echo "gem 'rbtrace', '~> 0.5.0'" >> Gemfile

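However, the above failed, presumably because "~> 0.5.0" means ">= 0.5.0 and < 0.6.0", which still leaves Bundler free to pick 0.5.1.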

Then I tried the more exact form:

 - echo "gem 'rbtrace', '= 0.5.0'" >> Gemfile

and it built just fine.

When I saw this reply from a "support expert" at meta:

I was surprised to see someone trolling to profit off of an obvious discourse gem configuration bundle error.

This is why I am happy to be mostly retired.

Well, that is enough ranting over poor technical support over at Discourse. I also found similar behavior (poor support, arrogance, cluelessness) over at the OpenAI dev forums from so many "experts" who also could not do basic analysis to understand the stack they are debugging.

As the old saying goes by Erasmus...

At my age, I've grown weary of "one-eyed kings".

2 Likes

Just successfully spun up another container:

Every 2.0s: docker ps                                                                    condor2769: Mon Dec 25 22:44:57 2023

CONTAINER ID        IMAGE                     COMMAND                  CREATED             STATUS              PORTS           NAMES
271b865b2dea        local_discourse/socket2   "/sbin/boot"             2 minutes ago       Up 2 minutes                        socket2
938f19881dbb        local_discourse/socket1   "/sbin/boot"             2 days ago          Up 2 days                           socket1

Using these lines in the socket2.yml file:

    - exec:
        cd: $home
        cmd:
          - sed -i '/rbtrace/d' Gemfile
          - sed -i '/rbtrace/d' Gemfile.lock
          - echo "gem 'rbtrace', '= 0.5.0'" >> Gemfile
          - su discourse -c 'bundle config unset deployment'
          - su discourse -c 'bundle install --no-deployment --path vendor/bundle --jobs 4 --without test development'

So basically, the above is the fix for those who want to keep rbtrace in their app (using the current discourse docker image).

Cheers.

3 Likes

Looks like they closed this topic a few days ago when they were nowhere near a solution!

The word "clueless" comes to mind.

1 Like

It's a sign of the times.

ChatGPT would have given a better answer than those meta clown-experts, to be honest.

Check this out:

ChatGPT Reply

Me:

Help me debug this: Discourse Build Error: "An error occurred while installing rbtrace (0.5.1), and Bundler cannot continue."

ChatGPT:

The error message you provided indicates that there was an issue while trying to install the rbtrace gem during the build process of a Discourse application. Here are some steps you can take to debug and resolve this issue:

1: Check the Gemfile.lock: Start by examining the Gemfile.lock in your Discourse project. This file specifies the exact versions of each gem required for your application. Look for the rbtrace gem and its version. It's possible that the version specified in the Gemfile.lock is causing conflicts.

2: Update Ruby and Bundler: Ensure that you are using a compatible version of Ruby and Bundler. Discourse usually specifies the recommended versions in their documentation. You can update Ruby using a version manager like RVM or rbenv, and you can update Bundler using the following command: .....

3: Retry Installation: After updating Ruby and Bundler, try running the installation process again: ......

4: Check Dependencies: Verify that your system has all the necessary dependencies for building gems. Some gems require native extensions that depend on system libraries. Ensure that you have the required development packages installed on your system.

5: Explicitly Specify Gem Versions: If the issue persists, you can try explicitly specifying the version of the rbtrace gem in your Gemfile. Add the following line to your Gemfile ........

....... etc etc

ChatGPT: Debugging Discourse: Gem Installation

ChatGPT's reply would not have solved the problem, but it was at least in the right part of the stack and ChatGPT was correct there was a gem version-conflict issue in the bundle process.

However, I had to strip out the gem first so I could build (bundle) and enter the container to see what version was in the base docker image, and then modify the Gemfile to ensure there was a version match.

1 Like

To be complete, Discourse is aware of this buggy rbtrace issue, but they have failed to properly address the issue (see commit history below) due to a misalignment (divergence) of the current discourse docker image and changes to the discourse code base.

https://github.com/search?q=repo%3Adiscourse%2Fdiscourse++rbtrace&type=commits

Yes, I get that, BUT they have failed to comply with some change, whatever it may be, between rbtrace versions. Users have already reported issues and they've failed to clock it.

I'm no expert on this topic but I reckon that they are about to face an avalanche of support requests on this very issue that they are presently blind to.

1 Like

I found it interesting that ChatGPT did a much better job debugging this issue than the "experts" at discourse meta.

As I have written in other posts, I am not a "kool-aid drinker" with regard to generative AI; but when ChatGPT outperforms humans, that really says something about the skill sets (or lack of them) of the humans.

I mean, it says a lot when someone who is supposed to be a subject matter expert is less skillful than ChatGPT (a text auto-completion, prediction engine) in the same subject matter.

!!!

1 Like

Thankfully, it would seem that the AI bots have yet to acquire that most human of attributes: hubris.

2 Likes