
Thursday, September 24, 2015

JBoss Domain - Could not connect to master - Unable to connect due to authentication failure

Recently I ran into a problem with a JBoss Domain that spans two different physical hosts.

First, I configured a JBoss Domain on just one machine, with a Domain Controller (DC) and two Host Controllers (HC). Everything was working fine, so I decided to go one step further and configure two more HCs on another machine.
The procedure seemed obvious: I cloned machine one and just changed the network interfaces on the new machine. The rest of the configuration remained the same, i.e., the host.xml of each HC, where the secret key used to connect to the DC was configured.

But this didn't work. When I started the HC on machine 2, I got the following error:
[root@localhost bin]# ./domain.sh
=========================================================================

  JBoss Bootstrap Environment

  JBOSS_HOME: /opt/jboss-eap-6.2-hc1

  JAVA: java

  JAVA_OPTS: -Xms64m -Xmx512m -XX:MaxPermSize=256m -Djava.net.preferIPv4Stack=true -Djboss.modules.system.pkgs=org.jboss.byteman -Djava.awt.headless=true

=========================================================================

09:23:02,219 INFO  [org.jboss.modules] (main) JBoss Modules version 1.3.0.Final-redhat-2
09:23:02,432 INFO  [org.jboss.as.process.Host Controller.status] (main) JBAS012017: Starting process 'Host Controller'
[Host Controller] 09:23:03,382 INFO  [org.jboss.modules] (main) JBoss Modules version 1.3.0.Final-redhat-2
[Host Controller] 09:23:03,560 INFO  [org.jboss.msc] (main) JBoss MSC version 1.0.4.GA-redhat-1
[Host Controller] 09:23:03,667 INFO  [org.jboss.as] (MSC service thread 1-1) JBAS015899: JBoss EAP 6.2.0.GA (AS 7.3.0.Final-redhat-14) starting
[Host Controller] 09:23:04,778 INFO  [org.xnio] (MSC service thread 1-2) XNIO Version 3.0.7.GA-redhat-1
[Host Controller] 09:23:04,791 INFO  [org.xnio.nio] (MSC service thread 1-2) XNIO NIO Implementation Version 3.0.7.GA-redhat-1
[Host Controller] 09:23:04,830 INFO  [org.jboss.remoting] (MSC service thread 1-2) JBoss Remoting version 3.2.18.GA-redhat-1
[Host Controller] 09:23:04,940 INFO  [org.jboss.as.remoting] (MSC service thread 1-2) JBAS017100: Listening on 192.168.56.202:9999
[Host Controller] 09:23:05,646 ERROR [org.jboss.remoting.remote.connection] (Remoting "localhost.localdomain:MANAGEMENT" read-1) JBREM000200: Remote connection failed: javax.security.sasl.SaslException: Authentication failed: all available authentication mechanisms failed
[Host Controller] 09:23:05,653 ERROR [org.jboss.as.host.controller] (Controller Boot Thread) JBAS010901: Could not connect to master. Aborting. Error was: java.lang.IllegalStateException: JBAS010942: Unable to connect due to authentication failure.
[Host Controller] 09:23:05,665 INFO  [org.jboss.as.controller] (MSC service thread 1-2) JBAS014774: Service status report
[Host Controller] JBAS014775:    New missing/unsatisfied dependencies:
[Host Controller]       service jboss.server.controller.management.security_realm.ApplicationRealm.properties_authentication (missing) dependents: [service jboss.server.controller.management.security_realm.ApplicationRealm]
[Host Controller]
[Host Controller] 09:23:05,692 INFO  [org.jboss.as.controller] (MSC service thread 1-2) JBAS014774: Service status report
[Host Controller] JBAS014776:    Newly corrected services:
[Host Controller]       service jboss.server.controller.management.security_realm.ApplicationRealm.properties_authentication (no longer required)
[Host Controller]
[Host Controller] 09:23:05,695 INFO  [org.jboss.as] (MSC service thread 1-2) JBAS015950: JBoss EAP 6.2.0.GA (AS 7.3.0.Final-redhat-14) stopped in 27ms
09:23:06,045 INFO  [org.jboss.as.process.Host Controller.status] (reaper for Host Controller) JBAS012010: Process 'Host Controller' finished with an exit status of 99
09:23:06,051 INFO  [org.jboss.as.process] (Thread-8) JBAS012016: Shutting down process controller
09:23:06,052 INFO  [org.jboss.as.process] (Thread-8) JBAS012015: All processes finished; exiting

At first I didn't know why, but it turns out that when the DC and the HC are on the same machine, you can put any slave name you want in the HC's host.xml (same-machine connections are apparently accepted through the local authentication mechanism). When the DC and the HC are on different machines, however, the slave name must match a management user on the DC.

So what I did was:
1) Create a management user, for example "adminHostController003".
    You have to launch the add-user.sh script on the DC server.
    This user will have the following characteristics:
  • Management user (in ManagementRealm)
  • No groups/roles
  • Remoting connections allowed (answer "yes" when asked whether the user will be used for one AS process to connect to another)
2) Copy the secret key and edit host.xml on the Host Controller.
   In this file you have to configure:
  • Host name: the name of the user created in step 1.
  • Server identities: the secret key associated with this user.
  • Domain controller location: IP and port of the DC.
<host name="adminHostController003" xmlns="urn:jboss:domain:1.5">

 <management>
  <security-realms>
   <security-realm name="ManagementRealm">
    <server-identities>
     <secret value="Y2l4dGVjLjIwMTU="/>
    </server-identities>
    <authentication>                  
     <properties path="mgmt-users.properties" relative-to="jboss.domain.config.dir"/>
    </authentication>
    <authorization map-groups-to-roles="false">
     <properties path="mgmt-groups.properties" relative-to="jboss.domain.config.dir"/>
    </authorization>
   </security-realm>
   <security-realm name="ApplicationRealm">
    <authentication>
     <local default-user="$local" allowed-users="*"/>
     <properties path="application-users.properties" relative-to="jboss.domain.config.dir"/>
    </authentication>
    <authorization>
     <properties path="application-roles.properties" relative-to="jboss.domain.config.dir"/>
    </authorization>
   </security-realm>
  </security-realms>
  <audit-log>
   <formatters>
    <json-formatter name="json-formatter"/>
   </formatters>
   <handlers>
    <file-handler name="host-file" formatter="json-formatter" path="audit-log.log" relative-to="jboss.domain.data.dir"/>
    <file-handler name="server-file" formatter="json-formatter" path="audit-log.log" relative-to="jboss.server.data.dir"/>
   </handlers>
   <logger log-boot="true" log-read-only="false" enabled="false">
    <handlers>
     <handler name="host-file"/>
    </handlers>
   </logger>
   <server-logger log-boot="true" log-read-only="false" enabled="false">
    <handlers>
     <handler name="server-file"/>
    </handlers>
   </server-logger>
  </audit-log>
  <management-interfaces>
   <native-interface security-realm="ManagementRealm">
    <socket interface="management" port="${jboss.management.native.port:9999}"/>
   </native-interface>
  </management-interfaces>
 </management>

 <domain-controller>
  <remote host="192.168.56.101" port="9999" security-realm="ManagementRealm"/>
 </domain-controller>

 <interfaces>
  <interface name="management">
   <inet-address value="${jboss.bind.address.management:192.168.56.202}"/>
  </interface>
  <interface name="public">
   <inet-address value="${jboss.bind.address:192.168.56.202}"/>
  </interface>
  <interface name="unsecure">
   <inet-address value="${jboss.bind.address.unsecure:192.168.56.202}"/>
  </interface>
 </interfaces>

 <jvms>
  <jvm name="default">
   <heap size="64m" max-size="256m"/>
   <permgen size="256m" max-size="256m"/>
   <jvm-options>
    <option value="-server"/>
   </jvm-options>
  </jvm>
 </jvms>

 <servers>    
 </servers>

</host>
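A detail worth knowing: the secret value is nothing more than the Base64 encoding of the management user's password (add-user.sh also prints a ready-made secret element at the end when you answer "yes" to the final question). Taking the sample value from the host.xml above as the example, you can generate or verify it from the shell:

```shell
# Encode the management user's password (printf avoids a trailing newline):
printf '%s' 'cixtec.2015' | base64
# -> Y2l4dGVjLjIwMTU=

# Decode an existing secret to double-check which password it carries:
printf '%s' 'Y2l4dGVjLjIwMTU=' | base64 -d
# -> cixtec.2015
```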

Note.
After a while I realized that you cannot just copy the whole JBoss installation directory to the other machine (remember, I cloned the machines). This leads to problems with HornetQ, because you end up with several servers on the same network sharing the same node ID.
This ID is created the first time a node starts and is stored in an internal data directory, so if you copy the whole installation you also copy the ID.
In these cases you will see this WARN message:
12:32:54,418 WARN  [org.hornetq.core.client] (hornetq-discovery-group-thread-dg-group1) HQ212034: There are more than one servers on the network broadcasting the same node id. You will see this message exactly once (per node) if a node is restarted, in which case it can be safely ignored. But if it is logged continuously it means you really do have more than one node on the same network active concurrently with the same node id. This could occur if you have a backup node active at the same time as its live node. nodeID=374ca1fe-61d7-11e5-b4c6-75baa50d332e
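If you run into this, the fix is to remove the HornetQ data directories on the cloned machine so that a fresh node ID is generated on the next start. A minimal sketch, assuming the default EAP 6 per-server data layout (the paths and the server name here are illustrative; only wipe the journal if you can afford to lose the messages in it):

```shell
# Simulate a cloned slave's data layout (illustrative paths; on a real
# install JBOSS_HOME points at the EAP directory):
JBOSS_HOME=$(mktemp -d)
mkdir -p "$JBOSS_HOME/domain/servers/server-one/data/messagingjournal"

# HornetQ stores the node ID in server.lock inside the journal directory,
# created on first start. Removing the messaging* data directories forces
# a new node ID to be generated when the server boots again:
rm -rf "$JBOSS_HOME"/domain/servers/*/data/messaging*
```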