Wonderful.
About the invalidate problem, you can, for example, call the Invalidate method of the container.
The drawing can be optimised to take better advantage of OpenGL. In FXDraw, the content of the fx (TBGLBitmap) variable is changed on every call, even if the bitmap FBGRA has not been updated. So the image data is transferred to video memory each time. That is not necessary, because the content does not always change.
Instead you can use TBGLBitmap only. Indeed you can draw a TBGLBitmap on a regular canvas.
Otherwise, if you would like your control to also work without the container, with the regular canvas and without OpenGL, you can continue using TBGRABitmap and drop TBGLBitmap. Just keep a texture variable. When the bitmap changes, set the texture to nil, for example, so you know it must be created again the next time you draw in the container. The texture can be created using BGLTexture(...). That is the moment it is transferred to video memory.
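To make the idea concrete, here is a minimal sketch of the "keep a texture variable" approach. The class name TFXCachedControl, the method RedrawBitmap and the parameterless FXDraw signature are assumptions for illustration and are not taken from your code. It also assumes the container has made its OpenGL context current before calling FXDraw.

```pascal
unit FXTextureCacheSketch;

{$mode objfpc}{$H+}

interface

uses
  Classes, Controls, BGRABitmap, BGRABitmapTypes, BGRAOpenGL;

type
  { Hypothetical control, named for illustration only }
  TFXCachedControl = class(TGraphicControl)
  private
    FBGRA: TBGRABitmap;     // CPU-side bitmap, also usable on a regular canvas
    FTexture: IBGLTexture;  // cached copy in video memory, nil when stale
  protected
    procedure Paint; override;  // regular canvas path, no OpenGL involved
  public
    destructor Destroy; override;
    procedure RedrawBitmap;     // re-render FBGRA and drop the cached texture
    procedure FXDraw;           // assumed to be called by the container
  end;

implementation

destructor TFXCachedControl.Destroy;
begin
  FTexture := nil;  // releases the texture; the right context should be current
  FBGRA.Free;
  inherited Destroy;
end;

procedure TFXCachedControl.RedrawBitmap;
begin
  if FBGRA = nil then
    FBGRA := TBGRABitmap.Create(Width, Height)
  else
    FBGRA.SetSize(Width, Height);
  FBGRA.Fill(BGRAPixelTransparent);
  // ... draw the control's content into FBGRA here ...
  FTexture := nil;  // the bitmap changed, so the texture must be rebuilt
  Invalidate;
end;

procedure TFXCachedControl.Paint;
begin
  if FBGRA <> nil then
    FBGRA.Draw(Canvas, 0, 0, False);
end;

procedure TFXCachedControl.FXDraw;
begin
  if FBGRA = nil then Exit;
  // Upload to video memory only when the cached texture is missing
  if FTexture = nil then
    FTexture := BGLTexture(FBGRA);
  BGLCanvas.PutImage(Left, Top, FTexture);
end;

end.
```

With this layout the upload happens once per content change instead of once per frame, and the regular Paint path keeps working when there is no container.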
When computing the shadow you can use rbBox:
BGRAReplace(FBGRAShadow, FBGRAShadow.FilterBlurRadial(FShadowSize/sqrt(2),
FShadowSize/sqrt(2), rbBox) as TBGRABitmap);
A patch example is attached. Note that there may be problems when the texture is freed, in particular if there are multiple OpenGL contexts. It would be safer to add an FXFree procedure that is called when the context is freed. The container could call it for each control. Note that the MakeCurrent function of the OpenGL context must be called first to ensure the correct context is selected.
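As a rough sketch of that FXFree idea, the container could do something like the following before its context goes away. TFXControlBase and ReleaseFXTextures are hypothetical names, and I am assuming here that the container descends from TCustomOpenGLControl (unit OpenGLContext) and that the FX controls are its direct children; adapt it to how your container actually keeps track of them.

```pascal
unit FXReleaseSketch;

{$mode objfpc}{$H+}

interface

uses
  Classes, Controls, OpenGLContext;

type
  { Hypothetical base class for controls that hold OpenGL resources }
  TFXControlBase = class(TGraphicControl)
  public
    // Descendants release their textures here, e.g. FTexture := nil
    procedure FXFree; virtual; abstract;
  end;

// Hypothetical helper, to be called by the container before its context is freed
procedure ReleaseFXTextures(AContainer: TCustomOpenGLControl);

implementation

procedure ReleaseFXTextures(AContainer: TCustomOpenGLControl);
var
  i: Integer;
begin
  // Select the container's context so textures are freed in the right one
  AContainer.MakeCurrent;
  for i := 0 to AContainer.ControlCount - 1 do
    if AContainer.Controls[i] is TFXControlBase then
      TFXControlBase(AContainer.Controls[i]).FXFree;
end;

end.
```

The point of the design is only that the release happens while the owning context is current, rather than whenever the control happens to be destroyed.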
For the background gradient of the container, as OpenGL is fast for gradients, you can simply do:
BGLCanvas.FillRectLinearColor(0,0,FXContainer1.Width,FXContainer1.Height,BGRABlack,BGRABlack,BGRAWhite,BGRAWhite);