Unfortunately, I don’t think that any method yet proposed can be used to make a reasonable claim of convergence. Convergence diagnostics can only tell you that a chain has not converged. That being said, you clearly need some criterion to stop sampling.
Again, we did not use TGR-PSRF values as a stopping criterion, and I don’t yet have an understanding of what a specific threshold would imply (if anything). Judging only from the values we obtained on our datasets, a threshold of 1.1 does seem too lenient and a threshold of 1.01 does seem reasonable. The downside to fixing a threshold for PSRF is that you would be the first to do so and may either (1) get a false sense of security with a threshold that is too high or (2) require an excessive amount of computation with a threshold that is too low or even unobtainable. I want to make it clear for anyone else reading this that TGR-PSRF is a newly proposed idea and that I don’t recommend it be used as the sole measure of convergence (which you did not suggest).
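To make the threshold discussion concrete, here is a minimal sketch of the classic Gelman-Rubin PSRF for a single scalar quantity (for example, log-likelihood samples from each chain). This is not the TGR-PSRF from our paper, only the standard scalar statistic it is named after; the function name `psrf` and the toy data are made up for illustration.

```python
from math import sqrt
from statistics import mean, variance


def psrf(chains):
    """Classic Gelman-Rubin PSRF for one scalar quantity.

    `chains` is a list of equal-length lists of samples, one list per
    independent chain. Hypothetical helper, not part of MrBayes.
    """
    m = len(chains)
    n = len(chains[0])
    if m < 2 or n < 2:
        raise ValueError("need at least two chains with two samples each")
    if any(len(chain) != n for chain in chains):
        raise ValueError("all chains must have the same length")

    # W: average of the within-chain sample variances.
    within = mean([variance(chain) for chain in chains])
    if within == 0:
        raise ValueError("within-chain variance is zero")

    # B / n: sample variance of the chain means.
    between_over_n = variance([mean(chain) for chain in chains])

    # Pooled estimate of the posterior variance.
    pooled = (n - 1) / n * within + between_over_n
    return sqrt(pooled / within)


# Toy data: two chains that agree, and two chains stuck in different places.
agreeing = [[0.0, 1.0, 0.0, 1.0], [1.0, 0.0, 1.0, 0.0]]
separated = [[0.0, 1.0, 0.0, 1.0], [10.0, 11.0, 10.0, 11.0]]

for label, chains in (("agreeing", agreeing), ("separated", separated)):
    value = psrf(chains)
    print(label, round(value, 3), "below 1.01" if value < 1.01 else "not below 1.01")
```

A value near 1 only says that the chains are not distinguishable by this statistic; as noted above, it cannot show that they have converged.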
There are two observations we made in our paper with respect to convergence criteria that you may wish to consider. First, how many independent chains are you using? As far as I can tell this has not been discussed much previously, but we observed that an ASDSF threshold of 0.01 is much stricter with three or four independent chains (runs in MrBayes parlance) instead of two. In our tests on peaky datasets we found that many tests with two runs would fail to converge using an ASDSF criterion simply because the chains happened to sample the same peak more often than not. You may find that ASDSF gives you a better indicator of convergence with three or four runs if you have only been using two (which is the default in MrBayes).
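The effect of the number of runs can be illustrated with a small sketch of an ASDSF-style calculation. The assumptions here are mine: split frequencies are given as one dictionary per run, the sample standard deviation is used, and splits are included only if they reach a minimum frequency (0.1 below) in at least one run. The exact details in MrBayes may differ, and the function name and toy frequencies are hypothetical.

```python
from statistics import mean, stdev


def asdsf(run_freqs, min_freq=0.1):
    """Average standard deviation of split frequencies across runs.

    `run_freqs` is a list of dicts mapping a split label to its sampled
    frequency in [0, 1], one dict per run. A split missing from a run is
    treated as having frequency 0 in that run.
    """
    if len(run_freqs) < 2:
        raise ValueError("need at least two runs")
    all_splits = set().union(*run_freqs)
    deviations = []
    for split in all_splits:
        freqs = [run.get(split, 0.0) for run in run_freqs]
        # Assumed cutoff: ignore splits that are rare in every run.
        if max(freqs) >= min_freq:
            deviations.append(stdev(freqs))
    return mean(deviations) if deviations else 0.0


# Toy "peaky" posterior with two competing topologies.
run_a = {"AB|CD": 0.95, "AC|BD": 0.05}  # sits on the first peak
run_b = {"AB|CD": 0.95, "AC|BD": 0.05}  # happens to sit on the same peak
run_c = {"AB|CD": 0.05, "AC|BD": 0.95}  # sits on the other peak

print("two runs:  ", asdsf([run_a, run_b]))
print("three runs:", asdsf([run_a, run_b, run_c]))
```

Two runs that happen to sit on the same peak agree with each other, so the statistic is small even though neither run has visited the other peak. Each additional run is another chance to land on a different peak and push the value well above 0.01.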
Second, have you sampled your problematic datasets multiple times? If you obtain similar split frequencies from 3 or 4 completely independent MCMC executions with random starting points then that is a better indicator of convergence than any diagnostics on a single MCMC execution.
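One simple way to compare completely independent executions is to look at the largest disagreement in any split frequency between any pair of them. The sketch below assumes you have already extracted split frequencies into one dictionary per execution; the helper names and the numbers are hypothetical, and what counts as "similar" is left to your judgment.

```python
from itertools import combinations


def max_split_difference(freqs_a, freqs_b):
    """Largest absolute difference in frequency over all splits seen in
    either execution. A split missing from one execution counts as 0 there."""
    splits = set(freqs_a) | set(freqs_b)
    return max(
        (abs(freqs_a.get(s, 0.0) - freqs_b.get(s, 0.0)) for s in splits),
        default=0.0,
    )


def worst_pairwise_difference(executions):
    """Largest split frequency difference over every pair of executions."""
    return max(
        (max_split_difference(a, b) for a, b in combinations(executions, 2)),
        default=0.0,
    )


# Toy data: the third execution disagrees strongly with the first two.
executions = [
    {"AB|CD": 0.90, "AC|BD": 0.10},
    {"AB|CD": 0.88, "AC|BD": 0.12},
    {"AB|CD": 0.50, "AC|BD": 0.50},
]
print(round(worst_pairwise_difference(executions), 3))
```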
Finally, have you tried other methods of looking at convergence? For example, looking at PSRF values for split frequencies or graphing split frequencies with AWTY? In your potentially problematic datasets, does the ASDSF drop smoothly or bounce around? The latter could indicate a peaky dataset, particularly if the results are not consistent between multiple MrBayes executions.
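If you record the ASDSF at regular intervals, a rough way to tell "drops smoothly" from "bounces around" is to count how often the value goes up from one report to the next. This is only an ad hoc heuristic of my own for illustration, not an established diagnostic; the function name, the tolerance parameter, and the example traces are all hypothetical.

```python
def fraction_of_increases(asdsf_trace, tolerance=0.0):
    """Fraction of consecutive steps in which the ASDSF rises by more
    than `tolerance`. A smoothly decreasing trace gives 0.0."""
    if len(asdsf_trace) < 2:
        raise ValueError("need at least two ASDSF values")
    steps = list(zip(asdsf_trace, asdsf_trace[1:]))
    increases = sum(1 for previous, current in steps if current - previous > tolerance)
    return increases / len(steps)


# Toy traces: ASDSF values reported at successive checkpoints.
smooth = [0.20, 0.10, 0.05, 0.02, 0.01]
bouncy = [0.20, 0.05, 0.15, 0.03, 0.12]

print("smooth:", fraction_of_increases(smooth))
print("bouncy:", fraction_of_increases(bouncy))
```

A trace that keeps rising again after apparent progress is worth a closer look, especially if repeated MrBayes executions behave differently from one another.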