irc:1469052000 [2017/05/27 13:44] (current)
 +[09:12:46] *** ChanServ sets mode: +o temporalfox
 +
 +[12:21:26] *** ChanServ sets mode: +o temporalfox
 +
 +[17:10:25] *** ChanServ sets mode: +o temporalfox
 +
 +[22:52:01] *** ChanServ sets mode: +o temporalfox
 +
 +[23:28:14] <​temporalfox>​ hi AlexLehm
 +
 +[23:41:06] <​AlexLehm>​ Hello temporalfox
 +
 +[23:41:21] <temporalfox> I've made progress on the netty 4.1.3.Final OOM issue
 +
 +[23:41:29] <​temporalfox>​ now I know more or less what it is
 +
 +[23:41:41] <​temporalfox>​ basically it's not a leak
 +
 +[23:41:59] <temporalfox> The error happening is not a raw out of memory but rather an indication that the VM spends more CPU time in the GC than in the application (98%).
 +
 +[23:42:13] <​temporalfox>​ This happens when weak references are used
 +
 +[23:42:38] <​temporalfox>​ Netty has a Recycler class that recycles pooled objects
 +
 +[23:42:58] <​temporalfox>​ normally a thread recycles an object directly into its pool
 +
 +[23:43:41] <temporalfox> when an object is recycled from another thread, this object is stored in a WeakHashMap<Stack, WeakOrderQueue>
 +
 +[23:44:11] <​temporalfox>​ and later these objects are pulled when the Recycler needs objects (from the thread the recycler belongs to)
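The recycling path described above can be sketched in a few lines of Java. This is a simplified illustration and not Netty's actual Recycler: the class MiniRecycler, its Stack, and the methods recycle and get are hypothetical names, and a plain ConcurrentLinkedQueue stands in for Netty's WeakOrderQueue.

```java
import java.util.ArrayDeque;
import java.util.Map;
import java.util.Queue;
import java.util.WeakHashMap;
import java.util.concurrent.ConcurrentLinkedQueue;

class MiniRecycler {
    /** Pool owned by one thread; only the owner pops from it. */
    static final class Stack {
        final Thread owner = Thread.currentThread();
        final ArrayDeque<Object> items = new ArrayDeque<>();
        // Queues that other threads fill when they recycle our objects.
        final Queue<Queue<Object>> foreignQueues = new ConcurrentLinkedQueue<>();
    }

    private static final ThreadLocal<Stack> STACK = ThreadLocal.withInitial(Stack::new);

    // Per-thread weak map: "stack I recycled into" -> "my queue for that stack".
    private static final ThreadLocal<Map<Stack, Queue<Object>>> DELAYED =
            ThreadLocal.withInitial(WeakHashMap::new);

    static Stack currentStack() {
        return STACK.get();
    }

    /** Returns an object to the stack it came from. */
    static void recycle(Stack origin, Object o) {
        if (origin.owner == Thread.currentThread()) {
            origin.items.push(o); // fast path: same thread, straight into the pool
            return;
        }
        // Slow path: park the object in this thread's queue for the foreign stack.
        Queue<Object> queue = DELAYED.get().computeIfAbsent(origin, s -> {
            Queue<Object> created = new ConcurrentLinkedQueue<>();
            s.foreignQueues.add(created);
            return created;
        });
        queue.add(o);
    }

    /** Must be called by the owner of the stack. */
    static Object get(Stack stack) {
        if (stack.items.isEmpty()) {
            // Pull back whatever other threads have recycled for us.
            for (Queue<Object> q : stack.foreignQueues) {
                Object o;
                while ((o = q.poll()) != null) {
                    stack.items.push(o);
                }
            }
        }
        Object pooled = stack.items.poll();
        return pooled != null ? pooled : new Object();
    }

    /** Recycles an object from a second thread and checks the owner gets it back. */
    static boolean crossThreadRoundTrip() {
        Stack stack = currentStack();
        Object pooled = new Object();
        Thread other = new Thread(() -> recycle(stack, pooled));
        other.start();
        try {
            other.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
        return get(stack) == pooled;
    }

    public static void main(String[] args) {
        System.out.println("round trip ok: " + crossThreadRoundTrip());
    }
}
```

In this sketch the weak map key is the Stack, and the Stack holds its owner thread, so an entry can only be cleared once nothing else strongly references that stack and the GC has run.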
 +
 +[23:44:37] <temporalfox> it turns out that the WeakHashMap contains references to event loop threads (VertxThread)
 +
 +[23:44:54] <​temporalfox>​ and recently a commit changed the behavior of the recycler
 +
 +[23:45:06] <​temporalfox>​ so more objects are in weakhashmap
 +
 +[23:45:11] <​temporalfox>​ and thus more threads
 +
 +[23:45:30] <temporalfox> and it turns out that during tests it keeps a lot of threads from being garbage collected
 +
 +[23:45:35] <temporalfox> especially on slow machines
 +
 +[23:45:46] <​temporalfox>​ where allocation is faster than GC
 +
 +[23:45:55] <​temporalfox>​ hence the specific GC issue
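The error described here is presumably HotSpot's "java.lang.OutOfMemoryError: GC overhead limit exceeded", which is raised when the collector takes almost all of the CPU time while recovering very little heap. A rough way to see how much time a test JVM spends in GC is to read the standard management beans. The sketch below is only an approximation: it compares cumulative GC time to JVM uptime, whereas the VM's own check looks at recent collections, and the class name GcShare is a hypothetical name.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

class GcShare {
    /** Approximate fraction of JVM uptime spent in GC so far, clamped to [0, 1]. */
    static double gcTimeFraction() {
        long gcMillis = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long t = gc.getCollectionTime(); // -1 when the collector does not report it
            if (t > 0) {
                gcMillis += t;
            }
        }
        long uptimeMillis = ManagementFactory.getRuntimeMXBean().getUptime();
        if (uptimeMillis <= 0) {
            return 0.0;
        }
        return Math.min(1.0, (double) gcMillis / uptimeMillis);
    }

    public static void main(String[] args) {
        System.out.printf("GC share of uptime: %.1f%%%n", gcTimeFraction() * 100);
    }
}
```

Logging this value at the end of a test run on a slow machine would give a first hint whether the run is approaching the limit.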
 +
 +[23:46:41] <​temporalfox>​ this commit is the change https://​github.com/​netty/​netty/​commit/​afafadd3d7caf1e4b346da049baab0afeae0a4bc
 +
 +[23:47:05] <​temporalfox>​ I'm trying to figure out what is the correct thing to do
 +
 +[23:47:19] <temporalfox> it may simply be a more appropriate GC setting for tests
 +
 +[23:47:25] <​temporalfox>​ or change the recycler parameters
 +
 +[23:47:29] <​temporalfox>​ I don't know yet :-)
 +
 +[23:48:19] <​temporalfox>​ in the Recycler class there is
 +
 +[23:48:26] <AlexLehm> is that an issue that mostly happens in tests, since more objects are created and recycled there?
 +
 +[23:48:32] <​temporalfox>​ yes
 +
 +[23:48:33] <​AlexLehm>​ or would that happen in real applications as well?
 +
 +[23:48:41] <​temporalfox>​ no I don't think so
 +
 +[23:48:43] <​temporalfox>​ DELAYED_RECYCLED
 +
 +[23:48:43] <​AlexLehm>​ ok
 +
 +[23:48:53] <​temporalfox>​ this is the thread local weak hashmap
 +
 +[23:50:12] <​temporalfox>​ it's mainly because we create and destroy many Vertx instances
 +
 +[23:50:14] <​temporalfox>​ in tests
 +
 +[23:51:13] <​temporalfox>​ and it happens because of the ThreadDeathWatcher
 +
 +[23:51:32] <temporalfox> that schedules tasks when some thread dies
 +
 +[23:52:01] <​temporalfox>​ with PoolThreadCache
 +
 +[23:52:25] <​temporalfox>​ so the ThreadDeathWatcher fast thread local contains a weakhashmap that grows
 +
 +[23:52:35] <temporalfox> and weakly retains the vertx threads
 +
 +[23:52:56] <temporalfox> with the new change this map reaches around 2000 elements
 +
 +[23:53:02] <​temporalfox>​ and without it remains quite low
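The growth of such a map can be reproduced in isolation. The following is a minimal sketch under stated assumptions: it uses a plain WeakHashMap keyed directly by Thread (in Netty the keys are recycler stacks that reference the thread), and the class WeakThreadMapDemo and its run method are hypothetical names. It shows that entries for dead threads stay in the map until the keys are unreachable and a GC cycle has actually cleared them.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.WeakHashMap;

class WeakThreadMapDemo {
    /**
     * Returns two sizes: the map size while the dead threads are still strongly
     * referenced, and the size after releasing them and requesting GC.
     */
    static int[] run(int threadCount) {
        Map<Thread, Object> delayed = new WeakHashMap<>();
        List<Thread> strongRefs = new ArrayList<>();
        try {
            for (int i = 0; i < threadCount; i++) {
                Thread t = new Thread(() -> { });
                t.start();
                t.join(); // the thread is dead from here on
                delayed.put(t, new Object());
                strongRefs.add(t);
            }
            int before = delayed.size(); // dead threads are still retained
            strongRefs.clear();
            // System.gc() is only a hint, so the second number may vary.
            for (int i = 0; i < 10 && !delayed.isEmpty(); i++) {
                System.gc();
                Thread.sleep(20);
            }
            return new int[] { before, delayed.size() };
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        int[] sizes = run(200);
        System.out.println("before release: " + sizes[0] + ", after GC request: " + sizes[1]);
    }
}
```

If threads are created faster than the collector clears the weak entries, as can happen on a slow machine, the map keeps growing in the same way.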