irc:1433282400 [2017/05/27 13:44] (current)
 +[11:06:28] <​temporalfox>​ pmlopes cescoffier good morning :-)
 +
 +[11:06:37] <​pmlopes>​ good morning
 +
 +[11:06:45] <​cescoffier>​ morning !
 +
 +[11:21:41] <andyhedges> I'm pretty sure I know the answer to this question, but I'm going to ask anyway: should it always be safe to write a buffer to an HTTP response using end(buff) when the buffer was created from a byte array?
 +
 +[11:28:25] <​temporalfox>​ andyhedges why could it be a problem ?
 +
 +[11:29:50] <​andyhedges>​ Because it is a problem ;) On AWS using Amazon'​s Linux it goes into an infinite pause
 +
 +[11:30:12] <andyhedges> Works fine on my Mac, works fine on Azure with CentOS
 +
 +[11:30:18] <​andyhedges>​ Even works on Windows
 +
 +[11:31:13] <andyhedges> but on AWS I can send it into an infinite pause fairly predictably
 +
 +[11:33:00] <​andyhedges>​ 2015-06-03 08:​03:​20,​315 341555 WARN  [vertx-blocked-thread-checker] i.v.core.impl.BlockedThreadChecker - Thread Thread[vert.x-eventloop-thread-0,​5,​main] has been blocked for 162480 ms, time limit is 500
 +
 +[11:33:02] <​andyhedges>​ io.vertx.core.VertxException:​ Thread blocked
 +
 +[11:33:04] <​andyhedges> ​ at io.vertx.core.http.impl.HttpServerResponseImpl.handleDrained(HttpServerResponseImpl.java:​447) ~[[redacted].jar:​na]
 +
 +[11:33:06] <​andyhedges> ​ at io.vertx.core.http.impl.ServerConnection.handleInterestedOpsChanged(ServerConnection.java:​295) ~[[redacted].jar:​na]
 +
 +[11:33:08] <​andyhedges>​
 +
 +[11:33:45] <​andyhedges>​ Any ideas welcome :)
 +
 +[11:45:59] <​andyhedges>​ Just confirms it works fine on RHEL on AWS too - so just Amazon'​s Linux on AWS - grrr
 +
 +[11:46:06] <​andyhedges>​ confirmed*
 +
 +[11:48:49] <​Sticky>​ I would do a thread dump and look for deadlocks/​livelocks
 +
 +[11:49:27] <​andyhedges>​ Will take a look Sticky, thanks
 +
 +[11:49:27] <​temporalfox>​ andyhedges what if you clone the byte[] ?
 +
 +[11:50:20] <​andyhedges>​ so Buffer.buffer(b.clone())
 +
 +[11:50:33] <​temporalfox>​ to see what happens
 +
 +[11:51:56] <​temporalfox>​ at the end creating a Buffer from a String, calls getBytes() on the String
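
The clone suggestion above is a way to rule out aliasing, where the caller keeps modifying a byte array after handing it off. Whether Vert.x's Buffer copies or wraps the array is not stated in this conversation, so the sketch below uses the JDK's `java.nio.ByteBuffer.wrap` (which does wrap) purely as an illustration; the class and method names are hypothetical.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

class CloneVsAlias {
    // Wraps either the caller's array or a clone of it, then mutates the
    // caller's array and reports what the buffer sees at index 0.
    static byte firstByteAfterMutation(byte[] source, boolean cloneFirst) {
        byte[] backing = cloneFirst ? source.clone() : source;
        ByteBuffer buf = ByteBuffer.wrap(backing); // wrap() shares the array, it does not copy
        source[0] = (byte) 'X';                    // simulate the caller reusing its array
        return buf.get(0);
    }

    public static void main(String[] args) {
        byte[] shared = "abc".getBytes(StandardCharsets.US_ASCII);
        byte[] cloned = "abc".getBytes(StandardCharsets.US_ASCII);
        System.out.println((char) firstByteAfterMutation(shared, false));
        System.out.println((char) firstByteAfterMutation(cloned, true));
    }
}
```

If cloning changes nothing, as reported a few messages later, aliasing of the byte array is unlikely to be the cause.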
 +
 +[11:57:43] <​purplefox>​ andyhedges: the blocked thread checker should give you a stack trace telling you where the blocking is occurring
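
For readers unfamiliar with the idea behind a blocked thread checker, here is a minimal sketch using only the JDK. It is not Vert.x's BlockedThreadChecker implementation; the class name, method names and the way timestamps are passed in are all assumptions made for illustration. The watched thread records when it starts a task, and a periodic check compares the elapsed time against a limit and captures that thread's current stack.

```java
import java.util.concurrent.atomic.AtomicLong;

class BlockedThreadWatchdog {
    private static final long IDLE = -1L;

    private final Thread watched;
    private final long limitMillis;
    private final AtomicLong taskStartedAt = new AtomicLong(IDLE);

    BlockedThreadWatchdog(Thread watched, long limitMillis) {
        this.watched = watched;
        this.limitMillis = limitMillis;
    }

    // Called by the watched thread around each task it runs.
    void taskStarted(long nowMillis) { taskStartedAt.set(nowMillis); }
    void taskFinished() { taskStartedAt.set(IDLE); }

    // Called periodically from another thread (for example a timer).
    // Returns a warning including the watched thread's stack, or null if within the limit.
    String check(long nowMillis) {
        long started = taskStartedAt.get();
        if (started == IDLE || nowMillis - started <= limitMillis) {
            return null;
        }
        StringBuilder warning = new StringBuilder()
            .append("Thread ").append(watched.getName())
            .append(" has been blocked for ").append(nowMillis - started)
            .append(" ms, time limit is ").append(limitMillis);
        for (StackTraceElement frame : watched.getStackTrace()) {
            warning.append("\n    at ").append(frame);
        }
        return warning.toString();
    }
}
```

The stack captured this way shows where the thread currently is, which is why a thread waiting to enter a synchronized method shows up at that method even though nothing inside the method blocks.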
 +
 +[11:59:19] <​andyhedges>​ Yes, I pasted it above
 +
 +[11:59:24] <andyhedges> it's in handleDrained
 +
 +[11:59:33] <​andyhedges>​ the clone didn't improve matters btw
 +
 +[11:59:52] <​andyhedges>​ HttpServerResponseImpl.java:​447
 +
 +[11:59:57] <​andyhedges>​ is where it is blocked
 +
 +[12:00:24] <​andyhedges>​ or more helpfully io.vertx.core.http.impl.HttpServerResponseImpl.handleDrained(HttpServerResponseImpl.java:​447)
 +
 +[12:05:04] <​purplefox>​ andyhedges: can you show me the full stack - there'​s nothing in that method that blocks
 +
 +[12:05:17] <​purplefox>​ so unless you're doing a busy wait...
 +
 +[12:05:33] <​andyhedges>​ full stack from the BlockedThreadChecker?​
 +
 +[12:05:36] <​purplefox>​ yes
 +
 +[12:05:48] <​andyhedges>​ I'll pastebin it, one sec
 +
 +[12:07:49] <​purplefox>​ and do you get the same stack each time?
 +
 +[12:08:01] <​andyhedges>​ Yes, same each time
 +
 +[12:08:05] <​andyhedges>​ http://​pastebin.com/​RdakrE7N
 +
 +[12:08:35] <​purplefox>​ andyhedges: what version are you using?
 +
 +[12:08:51] <​andyhedges>​ milestone6
 +
 +[12:09:25] <​purplefox>​ ok, the method is synchronized,​ so i suspect you have a deadlock
 +
 +[12:09:32] <​purplefox>​ can you do a killall -3 java when this occurs?
 +
 +[12:09:37] <​purplefox>​ this should give you more information
 +
 +[12:09:42] <​andyhedges>​ lemme try
 +
 +[12:09:50] <​andyhedges>​ what would it deadlock with, any idea?
 +
 +[12:10:15] <​purplefox>​ don't know, but the dump should tell us
 +
 +[12:12:05] <​andyhedges>​ Did the killall, was expecting output on standard out, but nothing
 +
 +[12:12:41] <​purplefox>​ killall -3 ?
 +
 +[12:12:46] <​andyhedges>​ yup
 +
 +[12:12:59] <​purplefox>​ what jdk are you using?
 +
 +[12:13:40] <​andyhedges>​ OpenJDK 1.8.0_45-b13
 +
 +[12:14:03] <​purplefox>​ weird, that should certainly work
 +
 +[12:14:20] <​purplefox>​ you could kill -3 <pid>
 +
 +[12:14:29] <​purplefox>​ where <pid> is the pid of the process
 +
 +[12:14:37] <​purplefox>​ or kill -QUIT (but that means the same thing)
 +
 +[12:14:38] <​andyhedges>​ will try
 +
 +[12:14:50] <​purplefox>​ it's just a standard way of getting a dump
 +
 +[12:15:04] <​purplefox>​ i assume you are on linux>
 +
 +[12:15:05] <​purplefox>​ ?
 +
 +[12:15:28] <​purplefox>​ maybe your process is not called "​java"​
 +
 +[12:15:48] <​andyhedges>​ it's called java8 but I fixed that
 +
 +[12:15:58] <​andyhedges>​ also tried with the PID, strange
 +
 +[12:16:08] <​andyhedges>​ just googling for what might cause this
 +
 +[12:16:37] <​purplefox>​ try kill -9 <pid> and tell me what happens?
 +
 +[12:17:01] <​andyhedges>​ just got one with jstack if you are interested
 +
 +[12:17:30] <​purplefox>​ sure, however you get it, doesn'​t matter ;)
 +
 +[12:17:49] <​andyhedges>​ :)
 +
 +[12:18:03] <​purplefox>​ maybe you don't have a console attached to the process
 +
 +[12:18:13] <​purplefox>​ kill -3 outputs to the console
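
When no console is attached and `kill -3` output goes nowhere visible, a dump can also be taken from inside the JVM. The sketch below uses the standard `java.lang.management.ThreadMXBean`; the class name `InProcessThreadDump` and the output layout are assumptions, and how you would trigger it in a real application (an admin endpoint, a timer, a log hook) is left open.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

class InProcessThreadDump {
    // Builds a textual dump of all live threads with their states and stacks.
    static String dump() {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        // Only ask for lock details the JVM says it can provide.
        ThreadInfo[] infos = mx.dumpAllThreads(
            mx.isObjectMonitorUsageSupported(),
            mx.isSynchronizerUsageSupported());
        StringBuilder out = new StringBuilder();
        for (ThreadInfo info : infos) {
            out.append('"').append(info.getThreadName()).append("\" ")
               .append(info.getThreadState());
            if (info.getLockOwnerName() != null) {
                out.append(" waiting on lock held by \"")
                   .append(info.getLockOwnerName()).append('"');
            }
            out.append('\n');
            for (StackTraceElement frame : info.getStackTrace()) {
                out.append("    at ").append(frame).append('\n');
            }
            out.append('\n');
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.print(dump());
    }
}
```

The returned string can be written to a log file, which sidesteps the question of where the process's standard output is going.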
 +
 +[12:20:02] <​andyhedges>​ http://​pastebin.com/​SqEwZxHe
 +
 +[12:20:18] <​Sticky>​ "Found 1 deadlock"​
 +
 +[12:20:26] <​andyhedges>​ Indeed, but why
 +
 +[12:20:54] <andyhedges> I'm checking the code, probs something embarrassing I did :S
 +
 +[12:21:04] <​purplefox>​ you have two event loops deadlocking - this should never happen. can you provide a reproducer?
 +
 +[12:21:41] <​Sticky>​ andyhedges: no, this is not your fault
 +
 +[12:21:50] <​andyhedges>​ I can only reproduce on AWS, but I'll try and pull together some code to do so tonight
 +
 +[12:21:55] <​purplefox>​ in normal use an httpserverresponse should only be accessed by the same event loop
 +
 +[12:22:04] <​purplefox>​ but here it is being accessed by two different ones
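
As a generic illustration of how two threads end up in the state jstack reported, the sketch below has two threads take the same two monitors in opposite order, then asks the JVM whether it sees a monitor deadlock. This is not the Vert.x code path from the pastebin (its contents are not shown here); the thread names `loop-0` and `loop-1` and the class name are hypothetical.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;
import java.util.concurrent.CountDownLatch;

class DeadlockDemo {
    // Starts a daemon thread that locks `first`, waits until both threads
    // hold their first lock, then tries to lock `second`.
    static Thread lockInOrder(String name, Object first, Object second, CountDownLatch bothHoldFirst) {
        Thread t = new Thread(() -> {
            synchronized (first) {
                bothHoldFirst.countDown();
                try {
                    bothHoldFirst.await();
                } catch (InterruptedException e) {
                    return;
                }
                synchronized (second) {
                    // Not reached: the other thread holds `second` and wants `first`.
                }
            }
        }, name);
        t.setDaemon(true); // lets the JVM exit despite the deadlock
        t.start();
        return t;
    }

    // Creates the deadlock and returns how many threads the JVM reports as
    // deadlocked on monitors (0 if none is detected within about 5 seconds).
    static int countDeadlockedThreads() {
        Object a = new Object();
        Object b = new Object();
        CountDownLatch bothHoldFirst = new CountDownLatch(2);
        lockInOrder("loop-0", a, b, bothHoldFirst);
        lockInOrder("loop-1", b, a, bothHoldFirst);

        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        for (int attempt = 0; attempt < 100; attempt++) {
            long[] ids = mx.findMonitorDeadlockedThreads();
            if (ids != null) {
                return ids.length;
            }
            try {
                Thread.sleep(50);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return 0;
            }
        }
        return 0;
    }

    public static void main(String[] args) {
        System.out.println("deadlocked threads: " + countDeadlockedThreads());
    }
}
```

The usual ways out are to make every thread take the locks in the same order, or to confine each object to a single thread so that no lock is contended at all, which is the event loop model described above.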
 +
 +[12:22:30] <​andyhedges>​ There'​s only one Verticle with one HttpServer in it
 +
 +[12:22:42] <​purplefox>​ one verticle instance?
 +
 +[12:22:57] <​andyhedges>​ Yep, spawned from main
 +
 +[12:23:05] <​andyhedges>​ lemme double check that
 +
 +[12:23:10] <​purplefox>​ very strange
 +
 +[12:23:25] <​purplefox>​ I need to go into a meeting soon, but if you could create a reproducer we can take a look
 +
 +[12:23:37] <​andyhedges>​ Will do, I have meeting too :'(
 +
 +[12:23:47] <​andyhedges>​ Will get something together as soon as I can
 +
 +[12:23:59] <​andyhedges>​ Post to eclipse bugzilla?
 +
 +[12:24:48] <​purplefox>​ if you can push something to github that would be ideal
 +
 +[12:25:12] <purplefox> btw.. i remember someone posted an almost identical issue recently on the google group
 +
 +[12:30:37] <​andyhedges>​ OK, will take a look
 +
 +[12:30:42] <​andyhedges>​ Will github it, sure.
 +
 +[12:32:09] <​aesteve>​ yikes, I was about to push my beta app to AWS this evening. I will keep an eye on it, too
 +
 +[12:32:40] <​andyhedges>​ The other guy isn't using AWS Linux fwiw
 +
 +[12:32:47] <​andyhedges>​ from the google group
 +
 +[12:32:56] <​andyhedges>​ perhaps this just makes it happen faster
 +
 +[12:33:07] <​andyhedges>​ his manifests after 12 hours
 +
 +[12:33:15] <​andyhedges>​ mind I can do with a few calls to http
 +
 +[12:33:19] <​andyhedges>​ mine*
 +
 +[12:54:18] <​Sticky>​ it is almost certainly about the speed of the machine/​number of cores
 +
 +[12:54:30] <​Sticky>​ that make the deadlock more likely
 +
 +[12:55:05] <​Sticky>​ probably nothing specifically AWS linux related about it
 +
 +[14:33:06] <​andyhedges>​ I agree
 +
 +[14:33:16] <​andyhedges>​ although all the cloud machines are single core
 +
 +[14:47:53] <​purplefox>​ pmlopes: temporalfox cescoffier hi folks
 +
 +[14:48:02] <​cescoffier>​ Hi purplefox
 +
 +[14:48:08] <​cescoffier>​ how was Newcastle ?
 +
 +[14:48:10] <​pmlopes>​ hi purplefox
 +
 +[14:48:17] <​purplefox>​ been having an extremely hectic few days, but hopefully things will be back to normal soon
 +
 +[14:48:25] <​purplefox>​ newcastle was windy ;)
 +
 +[14:49:36] <purplefox> so.. how's it going with you guys?
 +
 +[14:50:46] <cescoffier> smoothly on my side. Fixed a couple of things in the javascript generation (the semicolon) and in the doc gen
 +
 +[14:51:09] <cescoffier> right now, I'm writing the docker manual with all the required content
 +
 +[14:51:15] <​cescoffier>​ should be done tonight
 +
 +[14:51:19] <​purplefox>​ great
 +
 +[14:51:39] <​cescoffier>​ tomorrow will focus on the core documentation
 +
 +[14:52:40] <​cescoffier>​ anything urgent I need to do in the meantime ?
 +
 +[14:54:56] <​cescoffier>​ I've to sync with temporalfox,​ but probably friday I will run a release dry run - to be sure everything works smoothly
 +
 +[14:55:06] <​purplefox>​ we should have a meeting soon to discuss what remains to be done
 +
 +[14:56:03] <​purplefox>​ but for now, examples, docs, docker, openshift that's all good
 +
 +[14:56:37] <​cescoffier>​ yep
 +
 +[14:57:07] <cescoffier> and everything will be documented in a central place (the manual I'm writing) as well as the fabric8 metadata, ruby, js and groovy examples
 +
 +[14:57:14] <cescoffier> and a distributed application example too
 +
 +[14:57:32] <​cescoffier>​ however, the distributed app is _cheating_ right now
 +
 +[14:58:22] <​purplefox>​ cool
 +
 +[14:58:30] <cescoffier> as I'm on mac, I'm using the boot2docker VM and multicast is working. I also have a working example with unicast, but would like to try it on a true distributed environment (with several machines running their own docker containers)
 +
 +[14:58:40] <​cescoffier>​ waiting to get my machine to do that ;-)
 +
 +[14:58:56] <​purplefox>​ pmlopes: hi paulo, how are you?
 +
 +[14:59:45] <​pmlopes>​ purplefox, i am fine, just spent almost all morning fighting with the tokens and kerberos but i am all set up now
 +
 +[15:00:06] <​cescoffier>​ pmlopes : did you use the google auth way too ?
 +
 +[15:01:06] <pmlopes> no, i had some trouble with it so i just picked a spare yubikey that i have here
 +
 +[15:01:26] <​pmlopes>​ and it works perfect, just tap and i am in
 +
 +[15:02:19] <​purplefox>​ ok folks so regarding the work to do for 3.0, it's mainly just docs, examples, website and fixing bugs
 +
 +[15:02:56] <​cescoffier>​ we should sync on the doc writing
 +
 +[15:03:18] <​purplefox>​ yeah
 +
 +[15:03:45] <​purplefox>​ temporalfox:​ hi julien, are you there?
 +
 +[15:04:01] <​purplefox>​ there are a few holes in the docs right now
 +
 +[15:07:50] <​purplefox>​ bbiab
 +
 +[15:08:21] <​cescoffier>​ I've made two PR about the doc last week (https://​github.com/​eclipse/​vert.x/​pulls/​cescoffier)
 +
 +[15:09:16] <cescoffier> (don't ask why one passes the CLA validation and not the other one, while both have the same email....)
 +
 +[15:09:44] <​temporalfox>​ purplefox hi
 +
 +[15:10:02] <​purplefox>​ hi julien, how are things?
 +
 +[15:10:47] <​temporalfox>​ things are doing ok :-)
 +
 +[15:13:50] <​purplefox>​ thanks for doing the release
 +
 +[15:13:53] <​temporalfox>​ purplefox how is it going for you ?
 +
 +[15:18:03] <purplefox> temporalfox: the last few days have been disruptive but i am looking forward to getting stuff done for the rest of the week
 +
 +[15:59:59] <​pmlopes>​ purplefox, temporalfox:​ i've completed an example of mongo, web and jade templates, to which repo should i upload?
 +
 +[16:00:19] <​purplefox>​ vertx-examples
 +
 +[16:00:39] <​purplefox>​ just send a PR to that :)
 +
 +[16:02:18] <​pmlopes>​ humm... that means that i need to integrate it with the examples runner, right?
 +
 +[16:03:04] <​temporalfox>​ pmlopes is it a vertx CLI example ?
 +
 +[16:03:21] <​pmlopes>​ no, it is a web app
 +
 +[16:03:29] <​temporalfox>​ ok
 +
 +[16:03:51] <​temporalfox>​ I'll try to make something that generates CLI examples from examples
 +
 +[16:04:01] <​temporalfox>​ so we can use them to test the CLI version easily
 +
 +[16:08:25] <​purplefox>​ ?
 +
 +[16:08:39] <​purplefox>​ all examples can be run at the command line too
 +
 +[16:09:06] <​purplefox>​ not sure i understand the issue here
 +
 +[16:09:08] <​temporalfox>​ how do you do that ?
 +
 +[16:09:27] <​purplefox>​ cd <example dir>
 +
 +[16:09:28] <​temporalfox>​ I haven'​t tried actually
 +
 +[16:09:32] <​purplefox>​ vertx run <example name>
 +
 +[16:09:33] <​temporalfox>​ does it work for non java ?
 +
 +[16:09:46] <​purplefox>​ yes, it even explains this in the readme
 +
 +[16:09:51] <​temporalfox>​ ok cool :-)
 +
 +[16:10:05] <​purplefox>​ that's kind of the point of the examples
 +
 +[16:10:15] <​temporalfox>​ I thought they were IDE only :-)
 +
 +[16:10:31] <​temporalfox>​ in M5 for instance, we haven'​t bundled the various template engine dependencies in the distrib
 +
 +[16:10:37] <​purplefox>​ temporalfox:​ rtfm ;)
 +
 +[16:10:46] <temporalfox> there are so many manuals to read :-)
 +
 +[16:11:01] <​temporalfox>​ cescoffier is even writing a new one today :-)
 +
 +[16:11:09] <​purplefox>​ it's just the main README on the examples project
 +
 +[16:12:19] <​cescoffier>​ in my case it's quite easy, I'm in docker ;-)
 +
 +[16:12:40] <​cescoffier>​ in or on top or against.... depends on my mood
 +
 +[16:35:32] <​aesteve>​ pmlopes: are you using an embedded mongo database for your example ?
 +
 +[16:35:59] <​pmlopes>​ aesteve: no i run a local mongo...
 +
 +[16:36:46] <​aesteve>​ ok I was wondering if I could submit my own example too, but was afraid I couldn'​t because Redis & Mongo aren't embedded
 +
 +[16:37:00] <​aesteve>​ and I wasn't sure it could be run :(
 +
 +[16:37:48] <​purplefox>​ why not use an embedded one?
 +
 +[16:38:19] <aesteve> cause I only thought about it recently ;)
 +
 +[16:38:24] <​pmlopes>​ aesteve: you're right it should start an embedded mongo to avoid external dependencies
 +
 +[16:38:42] <​aesteve>​ in my case I only need a Redis one
 +
 +[16:42:54] <aesteve> s/only/also
 +
 +[17:44:11] <andyhedges> Another question, is a callback handler always executed on the same thread that passed it?
 +
 +[18:37:49] <​temporalfox>​ andyhedges yes, unless it's a non vertx thread
 +
 +[18:38:48] <temporalfox> for instance in a JUnit test or embedded in a main, that won't be the same thread
 +
 +[18:40:36] <​temporalfox>​ andyhedges here is an example https://​github.com/​vietj/​vertx-materials/​blob/​master/​src/​main/​asciidoc/​Demystifying_the_event_loop.adoc#​embedding-vertx
 +
 +[18:43:37] <​andyhedges>​ Even a vertx-mongo-client call back
 +
 +[18:43:54] <​andyhedges>​ will read the link
 +
 +[18:49:31] <temporalfox> andyhedges for proxies like mongo-client it will depend on the context
 +
 +[18:49:56] <​temporalfox>​ well with mongo-client it's not a proxy
 +
 +[18:50:08] <temporalfox> mongo client uses executeBlocking under the hood
 +
 +[18:50:18] <​temporalfox>​ so read the executeBlocking section :-)
 +
 +[18:50:38] <​temporalfox>​ if you use mongo-service it will be the event bus
 +
 +[19:24:52] <andyhedges> so I'm using vertx-mongo-client and so I think you are saying that the callback could be on a different thread, is that right? going to read that section now, thanks for the help :)
 +
 +[19:33:38] <​andyhedges>​ OK, so I've just proved to myself they do operate on different threads
 +
 +[19:33:50] <​andyhedges>​ and I'm assuming that's the desired behaviour
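
The "same thread or not" question can be illustrated without Vert.x at all. The sketch below is a plain JDK analogy, not the Vert.x or mongo-client API: a single-threaded executor named `loop` stands in for an event loop, and another named `worker` stands in for wherever blocking work runs. The callback either runs directly on the worker or is handed back to the loop first. Which of the two a given Vert.x client does in a given version is not settled by this conversation.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Consumer;

class CallbackThreadDemo {
    // Returns { name of the thread that registered the callback,
    //           name of the thread that ran the callback }.
    static String[] run(boolean hopBack) {
        ExecutorService loop = Executors.newSingleThreadExecutor(r -> new Thread(r, "loop"));
        ExecutorService worker = Executors.newSingleThreadExecutor(r -> new Thread(r, "worker"));
        CompletableFuture<String> callerName = new CompletableFuture<>();
        CompletableFuture<String> callbackName = new CompletableFuture<>();

        loop.execute(() -> {
            callerName.complete(Thread.currentThread().getName());
            Consumer<String> callback =
                result -> callbackName.complete(Thread.currentThread().getName());
            if (hopBack) {
                // The worker produces the result, then schedules the callback on the loop.
                worker.execute(() -> loop.execute(() -> callback.accept("result")));
            } else {
                // The worker invokes the callback itself.
                worker.execute(() -> callback.accept("result"));
            }
        });

        try {
            return new String[] { callerName.join(), callbackName.join() };
        } finally {
            loop.shutdown();
            worker.shutdown();
        }
    }

    public static void main(String[] args) {
        for (boolean hopBack : new boolean[] { false, true }) {
            String[] names = run(hopBack);
            System.out.println("hopBack=" + hopBack
                + " registered on " + names[0] + ", callback ran on " + names[1]);
        }
    }
}
```

If a callback can arrive on a different thread, any state it shares with the registering code needs to be thread-safe, or the callback should hand its work back to the original thread as in the `hopBack` branch.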
 +
 +[19:34:03] <​andyhedges>​ so I've got to read you doco in more detail now...
 +
 +[19:34:08] <​andyhedges>​ your*
 +
 +[21:23:29] <​AlexLehm>​ if I have context problems in a unit test, does it make sense to fix the unit test or is this an issue that could happen in "​real"​ uses as well